INTRODUCTION


The  ultimate  goal  of  this  project  is  to  combine  features  derived  from  ultrasound  (US)  images,  US 
radio  frequency  (RF)  data,  tissue  elasticity  imaging,  and  clinical  data  such  as  PSA  into  a 
computerized  system  for  displaying  prostate  images  that  indicate  probable  location(s)  of  cancer. 
Each  of  these  different  classes  of  features  has  been  shown  to  be  useful  for  prostate  cancer 
detection. By combining those features in each class that perform best in a set of test cases, we
hope to develop an accurate tool for detecting regions on the ultrasound image that have a high
probability for cancer. Eventually we hope these techniques will be used to rapidly identify high
probability areas and mark them on the ultrasound image in real time or near real time.

This  project  began  by  gathering  RF  data  from  in-vitro  prostatectomy  specimens  in  cross  sectional 
planes  2mm  apart  using  a  linear  array  transducer.  These  data  are  used  to  calculate  RF  features 
such  as  power  spectrum  slope,  and  backscatter  coefficient  at  each  location  in  the  gland.  The  data 
are  also  used  to  generate  images  and  elastograms  from  which  image  texture  features  and  tissue 
hardness  features  are  computed.  The  features  will  be  correlated  with  histology  taken  at  the  same 
tissue  planes  to  determine  which  features  and  feature  combinations  most  accurately  predict  the 
presence  of  cancer.  The  various  image,  hardness,  and  RF  features  will  then  be  combined  with 
prior probability information derived from an AFIP 3D model of prostate cancer occurrence and with
clinical  PSA  values  to  produce  a  system  that  can  accurately  identify  the  presence  of  prostate 
cancer  using  ultrasound  data. 

After  developing  the  techniques  to  perform  identification  of  prostate  cancer  using  the  linear  array 
scans,  our  plan  is  to  migrate  the  technique  to  data  from  a  curved  array  transducer  and  then  finally 
to  data  from  an  endorectal  prostate  probe.  We  hope  in  the  end  to  be  able  to  demonstrate  an  in 
vitro  system  using  an  endorectal  prostate  probe  that  will  be  able  to  mark  areas  of  high  probability 
for  cancer  on  each  ultrasound  image.  This  will  prepare  us  for  an  in  vivo  study  directed  at 
developing  an  ultrasound  system  that  can  better  direct  biopsies  of  the  prostate  gland  to  areas  of 
high  likelihood  for  actual  prostate  cancer. 
RESEARCH ACTIVITIES AND PROGRESS


Administrative  Overview: 

Our efforts in the third year of the project have focused on continuing the clinical data
acquisition begun in June 1999, continuing to work with Mr. He, the graduate student, to refine the
software that computes the ultrasound based features, and completing development of a system for
correlating ultrasound features with pathology on “whole mount equivalent sections” made by
reassembling pathology slide sections. We have succeeded in developing this system, and
preliminary results were reported in June 2001.

The graduate student on the project, Mr. Zhe He, continued working on the project and completed a
usable version of software for ultrasound data analysis by May 2001. Preliminary analysis was
carried out using this software, and Mr. He then took some time out to write his master's thesis based
on the work, which was completed in September 2001 and accepted by the graduate college in
October 2001. Mr. He received his master's degree in October for the work. Training of Mr. He has
continued, with numerous software refinements currently underway. The main goals for the next
version are to include elastographic features in the analysis and to modify the manner in which the
user selects regions of interest. Mr. He has tentatively agreed to continue his studies
towards a Ph.D., which means that he will continue to work on the prostate project. This eliminates
the need to train a new graduate student.

Development of a user-friendly interface for the software has consumed a significant amount of Mr.
He’s time, forcing him to devote less time to the critical questions of ultrasonic feature computation
and software testing. To assist with these software development issues, a programmer has been
hired on a part time basis. Mr. Steven Felker, the programmer, has worked with the ultrasound
research group as a senior computer science major and has developed considerable familiarity with
Matlab programming. He will assist with file conversion software and graphical user interface
software development to allow Mr. He to focus more on feature computation and data fusion issues.

The complex process of combining the quarter section pathology slide images into the equivalent of
whole mount sections for comparison with the ultrasound images and data has been handled in the
past year by the research assistant, Gorana Skljarevski. She was trained in this process by Dr. Mark
Tuthill of the Department of Pathology and became quite proficient at scanning microscope slides,
rearranging them, labeling the resultant image files in an organized way, and combining them into
complete cross sectional images of the prostate gland. These cross sectional images were then
placed into a database for use by the ultrasound analysis software developed by Mr. He.
Unfortunately, Ms. Skljarevski left the project suddenly in June 2001 after her husband took a job in
another city. This brought both ultrasound data acquisition and pathology image processing to a
halt.

A  search  for  a  replacement  was  instituted  and  in  September  2001,  Mr.  Steven  Knight  was  hired. 

The principal investigator trained Mr. Knight in ultrasound data acquisition from prostatectomy
specimens over a four-week period, and Mr. Knight also received training from Dr. Tuthill on
pathology image reassembly. Unfortunately, after the training period, Mr. Knight performed only
two ultrasound acquisitions in two months and performed no pathology image assembly. It was
clear that, because of workload and personal problems, Mr. Knight could not perform the jobs
expected of him, so he was asked to resign in late November and tendered his resignation shortly
thereafter.

A  search  is  underway  for  a  replacement  and  during  the  interim  period,  the  P.I.  will  perform 
ultrasound  data  acquisition  and  pathology  image  assembly  as  time  permits. 

In  summary,  the  first  half  of  the  year  was  very  productive  but  work  in  the  second  half  of  the  year 
was  hampered  by  the  loss  of  the  laboratory  assistant  and  the  failure  of  her  replacement  to  carry  on 
the  data  acquisition/image  processing  work.  Additional  computer  programming  expertise  has  been 
hired  to  speed  up  software  development  and  a  search  for  a  laboratory  assistant  continues.  As  we 
had  several  candidates  for  the  job  before  selecting  Mr.  Knight,  we  are  optimistic  about  hiring  a  new, 
more  reliable,  laboratory  assistant  in  the  very  near  future. 


RESEARCH  PROGRESS 

Task  1  (Months  1-6):  Collect  RF  data  on  25  prostate  glands  with  the  linear  array  transducer. 
Develop  a  preliminary  plan  for  data  acquisition  for  tasks  5  and  7. 

This  portion  of  the  project  was  completed  prior  to  the  1999  annual  report  and  is  outlined  in  that 
document.  No  further  changes  to  data  acquisition  were  made  in  the  past  year  other  than  a 
reduction in the number of sutures used to mark the index slice of the ultrasound study. This
change was made to reduce the amount of time that the specimen spent in the ultrasound lab prior
to being received by pathology.

Task  2  (months  1-6):  Develop  a  methodology  for  registering  optical  pathology  information  with 
ultrasound  data. 

The procedure outlined in the previous annual report was successfully implemented with only
minor modifications. The procedure now consists of the following steps:

1. The prostatectomy specimen is fixed in formalin.

2. The gland is sectioned every 2-3mm after coating the surface of the gland with inks of
various colors to identify anterior and posterior surfaces.

3. Each 2-3mm thick whole cross section is divided into quarters.

4. The quarters are labeled and embedded in paraffin.

5. The embedded quarters are sectioned and mounted onto glass slides.

6. The slides are stained and examined by the pathologist, Dr. Trainer.

7. Areas of cancer are marked on the slides in indelible ink.

8. The slides are digitized by placing them on a flatbed scanner and scanning at 300dpi.
This produces pathology images of high enough resolution without producing
unnecessarily large image files (see figure 1).

9. The slide images (including identification labels indicating the original slice position and
quarter) are imported into Adobe Photoshop and reassembled into complete cross
sections (“whole mount equivalents”).

10. The whole mount equivalent images are placed into a database on a shared disk drive for
later comparison with ultrasound data.

An  example  of  a  “whole  mount  equivalent”  image  assembled  from  quarter  sections  is  shown  in 
figure  2. 


Figure 1. Scanned images of the four quarters of a pathology slice before reassembly
into a complete cross section. Areas of cancer are marked with blue ink outlines.


Figure 2. Quarter sections reassembled into a complete
cross-section (whole mount equivalent). Cancer areas are
marked by blue lines.
A few modifications to the original procedure outlined in the prior report were made. No
warping of pathology images was performed due to concerns by the pathologists that such a
procedure might introduce undesirable distortions into the pathology data. Thus some of the
whole mount equivalents have gaps where the quarter sections did not fit precisely with one
another. This has not proved to be a problem for correlation with ultrasound data. When looking
for benign or malignant areas, the gaps are simply avoided. Software to select the
corresponding pathology whole mount equivalent image based on the registration scheme
described in the 2000 report has been developed and successfully used. See the Task 4 description
for further discussion.

In summary, a method for registration of histologic information with ultrasound raw data has
been developed and is in use in other phases of the project. Task 2 is complete.


Task  3  (months  1-6):  Use  digital  database  of  prostate  cancer  rate  developed  at  Georgetown 
University  and  AFIP  to  establish  a  probability  map  of  prostate  cancer  in  a  3D  domain. 

The  available  pathology  data  from  UVM  have  been  transferred  to  Georgetown  University  for 
probability  map  creation.  The  creation  of  the  probability  map  and  3D  distribution  mapping  has 
been  described  in  the  previous  report.  Although  the  probability  distribution  map  has  not  yet  been 
created,  it  is  not  needed  at  this  point  since  incorporation  of  prior  probabilities  is  needed  only  in 
the  final  phases  of  UNKNOWN  region  of  interest  classification  and  we  are  still  in  the  phase  of 
computing  features  for  KNOWN  regions  of  interest  to  determine  which  features  best 
discriminate  cancer  from  benign  tissue. 

Further  work  on  3D  modeling  at  UVM  has  been  put  on  hold  pending  hiring  of  a  replacement  for 
the  research  assistant. 


Task  4  (months  1-9):  Software  development.  Adapt  existing  RF  analysis  software  and 
incorporate  texture  analysis.  Develop  software  to  automatically  calculate  RF  and  texture  features 
over  multiple  subregions  in  an  image. 

Rather  than  adapt  existing  RF  analysis  software,  it  turned  out  to  be  more  educational  and 
expedient  to  develop  new  software  based  on  MATLAB  to  compute  both  RF  and  Texture 
features.  The  previously  reported  user  interface  (figure  3)  was  completely  revamped  to  give  the 
user  a  way  to  process  regions  of  interest  selected  from  a  pathology  image. 
This new software, which is designed to allow a user to identify a normal or cancerous area on
the pathology image and find the corresponding region in the ultrasound data set, has been
completed and used successfully to analyze ultrasound data. The software was developed in
MATLAB with a Windows Graphical User Interface. To calculate RF or texture features, the user
selects the ultrasound data file that he/she wishes to use. The software automatically selects the
pathology slice that most closely corresponds to the ultrasound data based on the slice correlation
scheme reported in the previous annual report. The user then adjusts the size of the pathology
image to match the ultrasound image by drawing a box around the image of the prostate that
touches the image of the gland on all four sides. The pathology image is then oriented to match
the ultrasound data; this usually involves rotating the pathology image 180 degrees. The user
then draws a region of interest on the pathology specimen and specifies whether it is a
cancerous or benign region. The software automatically finds the corresponding region on the
corresponding ultrasound image, finds the raw RF data corresponding to that region,
computes RF and texture features from the RF data, and places the results in a database. It is
also possible to draw multiple regions and then have the software process the RF data for all regions
of interest at a later time as a batch process. Figure 4 shows the new GUI.
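The box-based correspondence described above can be illustrated with a minimal sketch. It is written in Python rather than the project's MATLAB, and it assumes that the mapping is a simple proportional scaling between the two gland bounding boxes with an optional 180 degree rotation. The names Box and map_roi and all pixel values are hypothetical and are not taken from the project software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    width: float
    height: float


def map_roi(roi: Box, path_box: Box, us_box: Box, rotate_180: bool = True) -> Box:
    """Map an ROI drawn on the pathology image onto the ultrasound image.

    path_box and us_box are the user-drawn boxes that touch the gland on all
    four sides in each image. The ROI is expressed as fractions of the
    pathology box, optionally rotated 180 degrees about the box centre, and
    then scaled into the ultrasound box.
    """
    # Fractional position and size of the ROI inside the pathology gland box.
    fx = (roi.left - path_box.left) / path_box.width
    fy = (roi.top - path_box.top) / path_box.height
    fw = roi.width / path_box.width
    fh = roi.height / path_box.height

    if rotate_180:
        # A 180 degree rotation sends (x, y) to (1 - x, 1 - y), so the far
        # corner of the ROI becomes its new top-left corner.
        fx = 1.0 - (fx + fw)
        fy = 1.0 - (fy + fh)

    return Box(
        left=us_box.left + fx * us_box.width,
        top=us_box.top + fy * us_box.height,
        width=fw * us_box.width,
        height=fh * us_box.height,
    )


# Illustrative numbers only.
path_box = Box(left=100, top=50, width=400, height=300)
us_box = Box(left=20, top=10, width=200, height=150)
roi = Box(left=140, top=80, width=80, height=60)
mapped = map_roi(roi, path_box, us_box)
unrotated = map_roi(roi, path_box, us_box, rotate_180=False)
print(mapped)
```

Because only relative positions inside the two boxes are used, a scheme of this kind does not require the pathology and ultrasound images to share a pixel size.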
Figure 4. New Ultrasound Data Processing Software Interface. Screen capture of
the interface used to process ultrasound data from defined regions. The pathology
image on the right side has a square region of interest drawn (dotted box) and a
region of cancer. The corresponding ultrasound data is denoted by the box drawn
automatically on the ultrasound image (left).


Although this new user interface was designed to let the user process selected regions
of interest so that a database of feature values for cancer and benign tissue could be generated,
the software is also capable of processing RF data from entire slices or multiple slices,
automatically subdividing the data into subregions and calculating features for those regions.

The  software  outlined  above  was  used  to  analyze  a  large  subset  of  our  acquired  data  to  verify 
correct  operation  of  the  software  and  to  begin  to  determine  the  most  useful  feature  combinations 
as  outlined  in  Task  7.  See  task  7  description  for  our  preliminary  results.  Based  on  these  results, 
one  important  modification  was  made  in  the  way  the  software  computes  features.  In  previous 
versions, the region of interest size was variable and controlled by the size selected by the user.
Evidence that ROI size biases the feature results prompted a change: users still select an ROI, but
features are now computed from subregions of the ROI of FIXED size to eliminate this bias.
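The fixed-size subregion idea can be sketched as follows. This is a minimal Python illustration, not the project's MATLAB code; it assumes non-overlapping tiles with partial tiles at the ROI edges discarded, which the report does not specify. The function names and the toy data are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Region = Tuple[int, int, int, int]  # (row, col, height, width)


def tile_roi(roi: Region, sub_h: int, sub_w: int) -> List[Region]:
    """Split an ROI into non-overlapping sub-regions of fixed size.

    Partial tiles at the right and bottom edges are discarded so that every
    sub-region has exactly sub_h x sub_w samples (an assumption of this sketch).
    """
    row0, col0, height, width = roi
    tiles = []
    for r in range(row0, row0 + height - sub_h + 1, sub_h):
        for c in range(col0, col0 + width - sub_w + 1, sub_w):
            tiles.append((r, c, sub_h, sub_w))
    return tiles


def roi_feature(
    data: Sequence[Sequence[float]],
    roi: Region,
    sub_h: int,
    sub_w: int,
    feature: Callable[[List[Sequence[float]]], float],
) -> float:
    """Average a feature over the fixed-size sub-regions of an ROI."""
    values = []
    for r, c, h, w in tile_roi(roi, sub_h, sub_w):
        block = [row[c:c + w] for row in data[r:r + h]]
        values.append(feature(block))
    if not values:
        raise ValueError("ROI is smaller than one sub-region")
    return mean(values)


def block_mean(block: List[Sequence[float]]) -> float:
    """Toy feature: mean sample value of a block."""
    return mean(v for row in block for v in row)


# Toy 8 x 10 data array standing in for RF or image samples.
data = [[float(r * 10 + c) for c in range(10)] for r in range(8)]
small = roi_feature(data, (0, 0, 4, 4), 4, 4, block_mean)
large = roi_feature(data, (0, 0, 8, 8), 4, 4, block_mean)
print(small, large)
```

Every feature value is computed from the same number of samples regardless of how large an ROI the user draws; a larger ROI only contributes more tiles to the average.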

In summary, the basic parts of task 4 are complete but ongoing modification continues to
incorporate data fusion elements (from Task 5) and to make the software more accurate, robust
and convenient to use.


Task  5  (months  12-18):  Data  Fusion 

Having developed the software to use pathology images to select data for RF processing, we
began developing data fusion software to combine elastography results with
RF results in user-selected regions of interest. Having had success with an interactive scheme
for orienting pathology images with those from ultrasound RF data, we have elected to initially
use the same approach for elastography, processing the elastographic data separately using
software from the University of Texas and combining those results with those from the RF
analysis. We have completed the development of software that calculates RF and texture
features from data corresponding to a user selected region on a pathology image AND selects the
appropriate region from the corresponding elastographic image, placing the mean strain value
from the elastogram ROI into the feature database (figure 5).


Figure 5. New Interface for Combining Elastography with RF Analysis. The user selects a region of interest
(dotted box) in the pathology image and the software automatically selects the corresponding region from the
RF data (left image) and elastogram (right image). The results of RF analysis, texture analysis, and the
elastographic strain are all placed in the database for cancer or benign depending on the ROI type selected in
the center pull down menu.


The  user  draws  boxes  around  each  image  to  inform  the  software  of  the  relative  sizes  of  the 
prostate  in  each  image  so  that  the  software  can  find  corresponding  regions  on  each  image.  This 
method  eliminates  problems  from  distortion  of  the  image  in  the  vertical  direction  that  can  occur 
in  elastography.  The  software  also  allows  the  user  to  adjust  the  display  of  the  elastogram  since 
the  elastographic  data  may  not  always  give  a  pleasing  image  without  grayscale  processing. 
The new software represents a significant programming change in that the tools used (Matlab
Guide)  are  different  from  the  previous  version  of  software.  This  necessitated  a  significant 
rewrite  of  code  but  should  yield  benefits  in  the  future  should  further  modifications  to  the  user 
interface  be  required.  The  elastographic  data  are  currently  being  acquired  from  the  actual 
elastographic  image,  which  is  a  map  of  strain  values  windowed  into  256  shades  of  gray.  The 
next  version  may  use  the  unwindowed  strain  values.  At  a  still  later  time,  it  may  be  appropriate  to 
calculate  the  elastographic  data  directly  from  the  RF  after  the  region  of  interest  is  selected.  This 
issue  will  be  addressed  after  new  elastographic  routines  that  include  effective  lateral  motion 
correction  are  developed  with  the  help  of  Dr.  Konofagu  as  noted  below. 

One  issue  of  great  concern  is  the  quality  of  the  elastographic  data.  Since  the  prostate  glands  are 
not  embedded  in  gel  as  were  the  glands  scanned  by  other  investigators  in  animal  work,  there  is 
great  potential  for  lateral  decorrelation  which  increases  the  noise  in  elastograms  and  decreases 
the  contrast  between  benign  and  malignant  tissue.  Software  with  improved  ability  to  correct  for 
lateral  motion  was  expected  from  the  University  of  Texas  by  April  2001.  It  finally  arrived  in 
October  2001  but  has  not  performed  to  expectations  on  test  objects.  Elastography  software 
problems  and  personnel  problems  at  Texas  have  prompted  us  to  move  forward  with  the 
development  of  new  elastography  software  incorporating  high  quality  lateral  correction  on  our 
own.  I  have  enlisted  the  help  of  Eliza  Konofagu  (currently  a  postdoctoral  fellow  at  Harvard)  to 
head  the  development  effort  with  the  aid  of  graduate  students  at  the  University  of  Texas.  Steve 
Felker  will  coordinate  integration  of  the  new  elastography  software  with  our  RF  software 
developed  by  Mr.  He. 

Another issue with elastography that was mentioned previously is the problem of quantifying
what have been regarded as qualitative images. Our plan to use the change in thickness of the
overlying standoff pad as a means of normalizing the strain values has been hampered by the fact
that many of the elastograms collected did not include enough of the standoff pad to measure it
accurately. We have modified the data acquisition routine to eliminate this problem, but it means
that some additional data with the linear array alone must be acquired to have a sufficient sample
of cancers. We have acquired approximately 10 glands using the modified technique and must
acquire an additional 10 to 15. In addition, we plan to test the normalization routines on a
phantom test object containing a hard inclusion of known stiffness relative to the surrounding
material. This object is under construction at the University of Wisconsin.
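One possible reading of the standoff pad normalization is sketched below in Python. The report does not give the formula, so this is an assumption: the fractional change in pad thickness is taken as a reference strain, and each tissue strain is divided by it so that values from acquisitions with different applied compression become comparable. The function names and numbers are hypothetical.

```python
from typing import List, Sequence


def pad_strain(thickness_before_mm: float, thickness_after_mm: float) -> float:
    """Fractional compression of the standoff pad (assumed reference strain)."""
    if thickness_before_mm <= 0:
        raise ValueError("pad thickness must be positive")
    return (thickness_before_mm - thickness_after_mm) / thickness_before_mm


def normalize_strains(roi_strains: Sequence[float], reference: float) -> List[float]:
    """Express tissue strains relative to the pad strain.

    Assumption of this sketch: the pad behaves as a uniform material, so its
    strain tracks the applied compression in each acquisition.
    """
    if reference <= 0:
        raise ValueError("no measurable pad compression")
    return [s / reference for s in roi_strains]


# Illustrative values only: a 10 mm pad compressed to 9.8 mm (2 % strain).
reference = pad_strain(10.0, 9.8)
normalized = normalize_strains([0.005, 0.010], reference)
print(reference, normalized)
```

A check of this kind requires that the full pad thickness be visible in the elastogram, which is the acquisition problem described above.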


Task  6  (months  7-18):  More  prostate  data  collection. 

As  mentioned  in  the  previous  section,  software  from  the  University  of  Texas  expected  to  allow 
acquisition  at  higher  compressions  for  higher  image  quality  did  not  meet  expectations  and 
because  of  personnel  problems  at  UT,  hopes  for  new  software  have  faded.  Thus  experiments 
using  larger  compressions  are  on  hold  pending  internal  development  of  new  elastography 
software  incorporating  lateral  correction  and  estimation  of  lateral  strains. 
In the previous report, the failure of our primary ultrasound instrument was documented as were
our  plans  should  the  instrument  prove  irreparable.  Luckily,  we  were  able  to  find  a  service 
engineer  with  experience  on  the  old  Diasonics  systems  and  in  February  2001,  the  Diasonics  unit 
was  repaired  and  became  operational  again.  By  this  time,  we  had  identified  some  problems  with 
the  RF  data  already  acquired  (such  as  the  absence  of  a  visible  overlying  standoff  on  some  cases 
and  saturation  of  the  A/D  converter  on  others)  and  decided  to  acquire  15-20  additional  prostate 
cases  using  the  linear  array  before  moving  on  to  linear  +  endorectal  curved  array  acquisitions. 

We have acquired about 10 of those cases despite the delay caused by the laboratory technician’s
departure. We hope to begin test acquisitions with the curved array probe in January or February
2002 with or without a replacement laboratory assistant. The PI will perform the acquisitions, but
acquisition with both a linear array and a curved array will require upwards of two hours,
usually during busy clinic hours. We therefore hope to hire a replacement lab assistant to reduce
conflicts between acquisitions and the clinical responsibilities of the PI.


Task  7  (months  11-22):  Compute  RF  and  texture  features  for  all  stage  1  acquisitions. 

Computation of RF and Texture Features on approximately 75% of the existing data was
completed by June 2001. Potentially useful features were identified using the Mahalanobis
distance as an index of the usefulness of both single features and feature combinations for
separating benign from cancerous tissue. Figure 6 shows the experimental setup used:


Figure 6. Data Acquisition Setup. The prostate gland is embedded in toweling and flooded with
normal saline. The ultrasound probe (long black arrow) is held vertically and compresses the gland,
driven by a computer controlled stepper motor. The standoff gel block is held in place by the
reddish orange plastic block (short black arrow) attached to an aluminum pressure plate attached
to the transducer.

A total of eight different RF and texture features were computed from selected regions of interest.
Features based on the RF data included the slope of backscatter vs. frequency, the zero frequency
intercept of the backscatter intensity, and the mid bandwidth value for backscatter. One feature
based on image statistics was computed: the image signal to noise ratio (SNR). Four image texture
features based on the co-occurrence matrix were computed: angular second moment, entropy,
contrast, and correlation. The features were computed from 36 cancer regions of interest and 19
benign regions of interest. Table 1 shows the means and standard deviations for the various features.
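The three RF features named in this task (slope, zero frequency intercept, and mid band value) are commonly obtained from a straight-line fit to the backscatter spectrum in dB over the usable bandwidth. The Python sketch below assumes an ordinary least-squares fit and takes the mid band value as the fitted line evaluated at the centre of the band. The project's MATLAB implementation may differ, and the spectrum values are synthetic.

```python
from typing import Sequence, Tuple


def spectral_fit(freqs_mhz: Sequence[float], power_db: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares line through a backscatter spectrum.

    Returns (slope in dB/MHz, intercept at 0 MHz in dB, mid band value in dB).
    """
    n = len(freqs_mhz)
    if n < 2 or n != len(power_db):
        raise ValueError("need at least two matching frequency/power samples")
    f_mean = sum(freqs_mhz) / n
    p_mean = sum(power_db) / n
    sxx = sum((f - f_mean) ** 2 for f in freqs_mhz)
    sxy = sum((f - f_mean) * (p - p_mean) for f, p in zip(freqs_mhz, power_db))
    slope = sxy / sxx
    intercept = p_mean - slope * f_mean
    # Mid band value: fitted line evaluated at the centre of the analysis band.
    f_mid = 0.5 * (min(freqs_mhz) + max(freqs_mhz))
    midband = intercept + slope * f_mid
    return slope, intercept, midband


# Synthetic spectrum lying exactly on the line -12 dB + 0.75 dB/MHz * f.
freqs = [4.0, 5.0, 6.0, 7.0, 8.0]
power = [-12.0 + 0.75 * f for f in freqs]
slope, intercept, midband = spectral_fit(freqs, power)
print(slope, intercept, midband)
```

In practice the spectrum would first be averaged over the RF lines in the region and corrected for the system response; those steps are omitted here.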
Table 1. Feature Values for Benign and Malignant Prostatic Tissue

FEATURE                   CANCER (Mean ± s.d.)      BENIGN (Mean ± s.d.)
Slope                     0.778 ± 0.348 dB/MHz      0.588 ± 0.326
Intercept                 -11.81 ± 2.19 dB          -10.41 ± 2.20
Mid Band Value            -7.88 ± 1.48 dB           -7.42 ± 1.57
Signal to Noise Ratio     1.62 ± 0.38               1.36 ± 0.35
Angular Second Moment     0.0099 ± 0.018            0.0029 ± 0.0026
Entropy                   -5.41 ± 1.04              -6.42 ± 0.78
Contrast                  4111 1 .38                2066 ± 1049
Correlation               -0.7469 ± 0.115           -0.8043 ± 0.0788
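The four texture rows of Table 1 are statistics of a gray-level co-occurrence matrix. A minimal Python sketch of how such features are conventionally defined is given below. The pixel offset, the number of gray levels, and the sign convention for entropy are assumptions; the negative entropy values in Table 1 suggest the project software used a different sign convention, so this sketch is not expected to reproduce the tabulated numbers.

```python
import math
from typing import Dict, List


def cooccurrence(image: List[List[int]], levels: int, dr: int = 0, dc: int = 1) -> List[List[float]]:
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pixel pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image too small for the requested offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p: List[List[float]]) -> Dict[str, float]:
    """Angular second moment, entropy, contrast and correlation of a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    asm = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    if var_i == 0 or var_j == 0:
        raise ValueError("correlation is undefined for a constant image")
    covariance = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    correlation = covariance / math.sqrt(var_i * var_j)
    return {"asm": asm, "entropy": entropy, "contrast": contrast, "correlation": correlation}


# Small 4-level test image (illustrative only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = cooccurrence(image, levels=4)
features = texture_features(glcm)
print(features)
```

The absolute size of contrast depends strongly on the number of gray levels used, which is one reason the values from a sketch like this cannot be compared directly with Table 1.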

The  Mahalanobis  distance  is  a  measure  of  the  statistical  distance  between  two  clusters  of  values 
relative  to  the  scatter  or  variance  of  those  values.  To  provide  good  discriminability  between 
benign  and  cancerous  prostate  tissue,  the  Mahalanobis  distance  should  be  maximized.  Table  2 
shows  the  Mahalanobis  distance  between  cancer  and  benign  tissue  using  linear  discriminant 
analysis  and  various  features  or  feature  combinations.  The  discriminant  analysis  was  performed 
using  Minitab  R13  software  using  the  leave-one-out  (cross-validation)  method.  This  method 
minimizes  the  optimistic  bias  that  results  from  using  the  same  data  for  both  training  and 
performance  estimation. 
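As an illustration of the quantity being tabulated, the Python sketch below computes the squared Mahalanobis distance between two groups using a pooled covariance matrix, the form usually associated with linear discriminant analysis. That Minitab reports exactly this quantity is an assumption. As a plausibility check, applying the one-feature form to the slope summary statistics in Table 1 (36 cancer and 19 benign regions) gives about 0.31, close to the 0.309 listed for slope in Table 2, which is consistent with the tabulated values being squared distances. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def pooled_covariance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> Tuple[List[float], List[float], Matrix]:
    """Group means and pooled sample covariance of two sets of feature vectors."""
    k = len(a[0])
    mean_a = [sum(row[j] for row in a) / len(a) for j in range(k)]
    mean_b = [sum(row[j] for row in b) / len(b) for j in range(k)]
    cov = [[0.0] * k for _ in range(k)]
    for rows, m in ((a, mean_a), (b, mean_b)):
        for row in rows:
            for i in range(k):
                for j in range(k):
                    cov[i][j] += (row[i] - m[i]) * (row[j] - m[j])
    dof = len(a) + len(b) - 2
    return mean_a, mean_b, [[v / dof for v in r] for r in cov]


def solve(matrix: Matrix, rhs: Sequence[float]) -> List[float]:
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def mahalanobis_sq(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Squared Mahalanobis distance between two groups (pooled covariance)."""
    mean_a, mean_b, cov = pooled_covariance(a, b)
    diff = [x - y for x, y in zip(mean_a, mean_b)]
    return sum(d * w for d, w in zip(diff, solve(cov, diff)))


def mahalanobis_sq_1d(mean_a: float, sd_a: float, n_a: int, mean_b: float, sd_b: float, n_b: int) -> float:
    """One-feature squared distance from summary statistics."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) ** 2 / pooled_var


# Slope values from Table 1: cancer 0.778 +/- 0.348 (n=36), benign 0.588 +/- 0.326 (n=19).
slope_d2 = mahalanobis_sq_1d(0.778, 0.348, 36, 0.588, 0.326, 19)
print(round(slope_d2, 3))  # about 0.311

# Two-feature toy example with made-up data.
group_a = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
group_b = [[3.0, 3.0], [5.0, 3.0], [3.0, 5.0], [5.0, 5.0]]
toy_d2 = mahalanobis_sq(group_a, group_b)
print(toy_d2)
```

The small difference between 0.311 and 0.309 is plausibly due to rounding of the published means and standard deviations.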


Table 2. Feature Performance for Cancer vs. Benign

FEATURE(S)                          MAHALANOBIS DISTANCE
Slope & Intercept                   0.418
Intercept & Entropy                 1.351
Slope & Entropy                     1.498
Intercept & Contrast                0.765
Entropy & Contrast                  1.14
Slope                               0.309
Intercept                           0.403
Signal to Noise, Slope, Entropy     1.700

As is usually the case, more features lead to larger values and greater separation. But with the
limited data set at hand, it is appropriate to use no more than 2-3 features to avoid an
optimistically biased estimate of performance for the task of separating cancer from benign
tissue. Receiver operating characteristic analysis was applied to the best two-feature
combination, slope and entropy. The resulting area under the ROC curve was Az = .77, far from
ideal but still encouraging.

During the analysis, it was noted that the size of the region of interest used could affect the
results, especially for SNR and the texture features. Since benign regions of interest tended to
be larger than cancer regions, some of the difference in features could be the result of ROI size.
To eliminate this effect, we have modified the software so that regardless of the size of ROI
chosen by the human observer, the features are all computed from sub regions of identical size
(approx RF lines wide). For a large region of interest, more subregions are present but this no
longer affects the mean value, only the variance and Standard Error of the Mean. We have
recomputed all features using this technique but the results have not yet been analyzed. A
disadvantage of this new method is that computation times become very long. We are
minimizing this by acquiring a faster PC for processing.
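The Az value quoted above for the slope and entropy combination is an area under the ROC curve. Az conventionally denotes the area under a fitted binormal curve; the Python sketch below substitutes the simpler nonparametric (Mann-Whitney) estimate, which is the probability that a randomly chosen cancer region receives a higher discriminant score than a randomly chosen benign region. The scores are invented for illustration and are not project data.

```python
from typing import Sequence


def roc_auc(cancer_scores: Sequence[float], benign_scores: Sequence[float]) -> float:
    """Nonparametric area under the ROC curve (Mann-Whitney statistic).

    Ties between a cancer score and a benign score count as half a win.
    """
    if not cancer_scores or not benign_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for c in cancer_scores:
        for b in benign_scores:
            if c > b:
                wins += 1.0
            elif c == b:
                wins += 0.5
    return wins / (len(cancer_scores) * len(benign_scores))


# Made-up discriminant scores for three cancer and three benign regions.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(auc)
```

An area of 0.5 corresponds to chance performance and 1.0 to perfect separation, which places the reported 0.77 between the two.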

In  summary,  we  have  completed  processing  of  a  considerable  portion  of  the  RF  data  and  the 
results  are  promising.  Using  the  knowledge  we  acquired  performing  that  processing,  we  are 
acquiring  some  additional  data  and  are  modifying  the  way  we  compute  the  features  to  remove 
bias  and  increase  reliability.  We  are  in  the  process  of  combining  the  elastography  feature  with 
the  RF,  image,  and  image  texture  features. 


Task  8  and  Task  9  (months  13-26  and  24-30):  Acquire  RF  with  a  curved  array  transducer. 

As  mentioned  in  Task  6,  collection  of  data  using  a  curved  array  transducer  will  begin  shortly. 

The  software  developed  for  linear  array  data  will  require  some  modification  so  that  the  correct 
region  is  obtained  from  the  ultrasound  data  once  the  pathology  image  region  of  interest  is 
selected.  This  problem  (one  of  scan  conversion — i.e.  polar  to  rectangular  coordinate  conversion) 
should  be  solvable  in  a  short  period  of  time.  Elastography  with  a  curved  array  has  already  been 
demonstrated  to  be  feasible,  but  the  problem  of  how  to  normalize  elastograms  to  correct  for  the 
non-uniform  stress  distribution  has  yet  to  be  solved.  One  method  is  to  use  a  modification  of  the 
method  we  plan  for  the  linear  array,  but  measure  standoff  distances  along  each  A-line  and 
perform  normalization  one  A-line  at  a  time.  This  method  will  be  tested  on  the  phantom  test 
object  currently  under  construction.  Since  the  Diasonics  scanner  is  once  again  operational,  there 
will  be  no  need  to  change  scanners  although  a  change  is  still  possible  should  the  Diasonics 
instrument  fail  or  should  the  Diasonics  endorectal  probe  (old  and  obsolescent)  prove  to  give  data 
of  inferior  quality. 
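The scan conversion problem mentioned above can be sketched as a mapping from a point in the rectangular image to an A-line number and sample number in the curved array RF data. The Python sketch below assumes an idealized geometry in which all A-lines radiate from a single centre of curvature with uniform angular spacing and uniform sample spacing along each line. The CurvedArray type and every parameter value are hypothetical and do not describe the Diasonics probe.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CurvedArray:
    """Idealized curved array geometry (all values are assumptions)."""
    radius_mm: float          # radius of curvature of the transducer face
    first_angle_rad: float    # steering angle of A-line 0 relative to the array axis
    angle_step_rad: float     # angular spacing between adjacent A-lines
    sample_spacing_mm: float  # distance between RF samples along an A-line
    n_lines: int
    n_samples: int


def cartesian_to_rf_index(x_mm: float, z_mm: float, probe: CurvedArray) -> Optional[Tuple[int, int]]:
    """Find the nearest (A-line, sample) for a point in rectangular coordinates.

    The origin is the centre of curvature; z points along the array axis.
    Returns None when the point lies outside the scanned sector.
    """
    r = math.hypot(x_mm, z_mm)      # radial distance from the centre of curvature
    theta = math.atan2(x_mm, z_mm)  # angle from the array axis
    line = round((theta - probe.first_angle_rad) / probe.angle_step_rad)
    sample = round((r - probe.radius_mm) / probe.sample_spacing_mm)
    if 0 <= line < probe.n_lines and 0 <= sample < probe.n_samples:
        return line, sample
    return None


probe = CurvedArray(
    radius_mm=10.0,
    first_angle_rad=-0.5,
    angle_step_rad=0.01,
    sample_spacing_mm=0.05,
    n_lines=101,
    n_samples=1000,
)
on_axis = cartesian_to_rf_index(0.0, 30.0, probe)
off_axis = cartesian_to_rf_index(25.0 * math.sin(0.2), 25.0 * math.cos(0.2), probe)
inside_probe = cartesian_to_rf_index(0.0, 5.0, probe)
print(on_axis, off_axis, inside_probe)
```

Applying this lookup to the corners of a region of interest would give the span of A-lines and samples to extract. The same per-line indexing is what a line-by-line standoff normalization would operate on.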

RESEARCH  ACCOMPLISHMENTS 

■  Software  has  been  developed  under  this  program  that  can  reliably  find  ultrasound  data 
corresponding  to  an  area  of  pathology  on  a  pathology  image.  This  technique  could  have 
broad  applicability  to  any  situation  where  in-vitro  scans  are  being  correlated  with 
pathology. 

■  Software  for  computing  both  RF  based  and  image  texture  based  features  for  prostatic 
tissue  or  any  other  tissue  has  been  developed  and  tested. 

■ Preliminary analysis shows that the features are promising and suggests that RF and
texture  based  features  can  be  used  to  discriminate  between  cancer  and  benign  tissue. 
Discriminability  will  hopefully  be  further  enhanced  by  adding  elastography. 

■  Software  to  combine  elastographic  strain  data  with  RF  and  texture  features  has  been 
developed  and  is  being  refined. 

REPORTABLE  OUTCOMES 

1. Database of completely sectioned prostate glands with all cancer foci located plus
correlated ultrasound raw data. This is a valuable resource that may be used for studies of
the distribution of cancer and for any study requiring ultrasound image or raw data that
can be precisely correlated with histology. Combining this data set with the AFIP
data set will produce a larger and more reliable data set than now exists for estimating
the probability of cancer as a function of location in the gland.

2. Abstract: He Z, Skljarevski G, Trainer T, Tuthill JM, Wagner RF, Huston D, Garra BS.
Classification of benign and malignant prostate tissue using radio frequency ultrasound
data: preliminary results of in vitro studies of radical prostatectomy specimens. Ultrasonic
Imaging 2000;22:238. See Appendix 1.

3. Presentation: He Z, Skljarevski G, Trainer T, Tuthill JM, Wagner RF, Huston D, Garra
BS. Classification of benign and malignant prostate tissue using radio frequency
ultrasound data: preliminary results of in vitro studies of radical prostatectomy
specimens. Presented at the 26th International Symposium on Ultrasonic Imaging and
Tissue Characterization, Rosslyn, VA, May 31, 2001. See Appendix 2.

4. Thesis: He, Zhi, Quantitative Sonographic Prostate Cancer Characterization, Master of
Science Thesis, October 2001. See Appendix 3.

5.  Masters  Degree  Awarded  October  2001  to  Zhi  He 

6.  Employment  &  Training  Supported  by  this  Project  Funding: 

a.  Masters  and  Doctoral  Training  by  Zhi  He 

b.  Part  time  Programmer:  Steven  Felker 

c.  Half  time  Research  Assistant:  TBN  (currently  vacant  position) 

CONCLUSIONS 

Despite  problems  caused  by  the  departure  of  the  research  assistant  and  continued  delays  in 
receiving  software  for  elastography  from  the  University  of  Texas,  considerable  progress  has  been 
made  in  the  past  year  with  completion  of  software  and  a  procedure  for  precisely  correlating 
pathology  and  ultrasound  data  acquired  in  vitro  with  approximately  2mm  spatial  accuracy.  These 
methods  and  the  software  could  be  applied  to  other  organs  with  equal  success. 

Our preliminary analysis of the RF data suggests that cancerous tissue can be differentiated from
benign tissue using RF and texture features. Some of the RF data are of poor quality, necessitating
acquisition of some additional data. These data will be acquired in a modified fashion that will allow
for normalization of the elastographic data in a novel manner that is simple but robust (unlike
other methods that have been reported, such as computation of the elastic modulus). Additional
personnel have been hired to speed up software development and incorporation of higher quality
elastography strain data into our database.

We are almost ready to tackle the problem of using a curved array transducer and are confident
that the needed modifications can be made to our software so that it will work properly with the
new transducer array. We remain confident that we will be able to produce probability images
showing areas likely to contain cancer using the combination of RF, texture, and elastographic
features. This has the potential to be a valuable tool for clinicians using ultrasound to guide
biopsies. With the release of standard clinical ultrasound machines capable of storing raw
ultrasound data (such as the new General Electric Logiq 9), it will also be something that can be
implemented on existing commercial hardware.
26th International Symposium on
Ultrasonic Imaging and Tissue Characterization

May 30 - June 1, 2001

Holiday Inn/Rosslyn Westpark Hotel
Arlington, VA


The annual International Symposia on Ultrasonic Imaging and Tissue Characterization have long
been recognized as the world’s leading forums concerned with ultrasonic techniques for medical
diagnosis. This year, sessions will be devoted to Elasticity, Tissue Parameters and
Imaging/Doppler. Forty-one contributions from six countries will be presented at these
sessions. A large number of papers will deal with clinical evaluation of novel methodology and
instrumentation for tissue characterization.

A special one-day session on Elasticity will take place on Wednesday, May 30. It will feature
fifteen papers in sessions of invited and contributed papers and will conclude with an
hour-long panel discussion.

The abstracts for the meeting have been published in the October 2000 issue of the journal
Ultrasonic Imaging (Dynamedia) and will be distributed to the attendees at the time of the
meeting. No proceedings will be published.
GOAL

To combine ultrasound RF, elastographic,
image texture, and clinical features to more
accurately locate regions of high suspicion
for cancer using prostate ultrasound.


IN VITRO PROSTATE TISSUE
CLASSIFICATION PROJECT

METHODS 1

METHODS 2

RESULTS

Feature      CA Mean (±s.d.)       Benign Mean (±s.d.)
Slope        0.778±.348 dB/MHz     .588±.326
Intercept    -11.81±2.19 dB        -10.41±2.20
Mid Band     -7.88±1.48 dB         -7.42±1.57
SNR          1.62±.38              1.36±.35
ASM          .0099±.018            .0029±.0026
ENT          -5.4±1.04             -6.42±.78
CON

Feature combination
Slope & ENT        1.498    .77
Int & CON
ENT & CON
Intercept
SNR, slope, ENT
Abstract


Prostate  cancer  is  second  only  to  lung  cancer  as  the  cause  of  cancer  deaths  among 
American  men.  All  men  are  at  risk  for  developing  prostate  cancer,  and  as  a  man  ages, 
his  risk  of  developing  prostate  cancer  increases.  The  purpose  of  this  study  is  to 
combine  clinical,  ultrasound,  and  elastographic  features  into  a  system  to  reliably 
identify  areas  of  the  prostate  that  are  likely  to  be  cancerous.  The  radio  frequency  (RF) 
ultrasound  data  were  acquired  at  2  mm  intervals  from  78  radical  prostatectomy 
specimens.  After  acquiring  the  ultrasound  data,  the  specimen  is  sectioned  for 
histological  analysis  at  2  mm  intervals  allowing  a  comparison  of  each  ultrasound 
‘slice’  with  a  corresponding  histology  image.  The  areas  of  cancer  in  each  histology 
image  are  marked  by  indelible  ink  and  a  corresponding  region  of  interest  in  the 
ultrasound  data  set  is  then  found.  The  ultrasound  features  used  in  this  study  include 
the  basic  texture  feature  -  envelope  signal-to-noise  ratio  (SNR),  four  features  coming 
from  the  co-occurrence  matrix  and  three  features  coming  from  spectral  analysis  of  RF 
echo  signals.  Software  has  been  developed  to  compute  feature  values  at  all  points  in 
each  RF  data  ‘slice’.  Using  two  features  together  (entropy  of  a  co-occurrence  matrix, 
and  correlation  of  a  co-occurrence  matrix),  the  best  classification  performance  is 
0.8386  (area  under  receiver  operating  characteristic  curve).  Using  three  features 
together  (entropy,  signal-to-noise  ratio,  and  slope  from  spectral  analysis  of  radio 
frequency  echo  signals),  the  best  classification  performance  is  0.8541.  The 
preliminary  results  show  RF  and  envelope-detected  signal  analyses  are  diagnostically 
useful  to  discriminate  cancer  in  prostate  tissue. 
Chapter 1 Introduction and Background


1.1  Prostate  Cancer 

Prostate  cancer  is  second  only  to  lung  cancer  as  the  cause  of  cancer  deaths  among 
American  men.  All  men  are  at  risk  for  developing  prostate  cancer,  and  as  a  man  ages, 
his  risk  of  developing  prostate  cancer  increases.  Early  detection  of  prostate  cancer  can 
dramatically  reduce  morbidity.  Convenient,  noninvasive  methods  of  prostate  cancer 
detection  have  the  potential  of  improving  early  detection  and  reducing  death  rates. 

1.1.1  Introduction 

The prostate is an organ that is only present in men. It lies just inferior to the urinary
bladder, as shown in Figure 1. It is a chestnut-shaped organ that surrounds the
beginning  of  the  urethra.  It  is  composed  of  30  to  50  compound  tubuloalveolar  glands 
between  which  is  the  fibromuscular  stroma.  These  glands  secrete  a  milky  fluid 
during  ejaculation  that  contributes  to  semen.  The  prostate  no  longer  serves  its  main 
purpose  when  fathering  children  is  no  longer  a  goal. 
Figure 1: Male organs [Cap 2001]

1.1.2  What  is  Prostate  Cancer? 

Because  the  prostate  is  wrapped  tightly  around  the  urethra,  any  enlargement  can 
cause  the  flow  of  urine  to  be  restricted.  This  often  happens  as  men  get  older.  It  may 
be  due  to  a  condition  known  as  Benign  Prostatic  Hyperplasia  (BPH).  BPH  is  not 
cancer.  However,  enlargement  of  the  prostate  may  also  be  caused  by  cancer.  Prostate 
adenocarcinoma is a malignant transformation and growth of the glandular
component of the prostate. This tumor can spread beyond the capsule (termed capsular
penetration), to the seminal vesicles, which are glands located next to the prostate and
below the bladder, or to the lymph nodes, which filter the clear fluid draining from the
prostate. The most common site of distant spread is to bone [Cap 2001].
1.1.3 Cause of Prostate Cancer

The  cause  of  prostate  cancer  is  presently  unknown.  But  the  risk  of  getting  prostate 
cancer  has  been  shown  to  vary  with  several  factors  [Prostate  2001]: 

Age:  Below  the  age  of  40,  the  risk  of  prostate  cancer  is  extremely  low.  After  the  age 
of  50,  the  chance  of  getting  prostate  cancer  increases  rapidly  with  age.  Three-quarters 
of  all  men  with  prostate  cancer  are  over  the  age  of  65. 

Race:  Studies  in  America  have  shown  African  American  men  are  twice  as  likely  to 
develop  prostate  cancer  as  white  men. 

Family  History:  Other  members  of  the  family  having  been  diagnosed  with  prostate 
cancer  increases  the  risk  by  2  or  3  times,  particularly  if  they  were  diagnosed  with  the 
disease  at  a  young  age  or  there  have  been  several  members  of  the  family  diagnosed 
with  the  disease. 

Diet:  American  studies  have  shown  that  a  diet  high  in  animal  (saturated)  fat  may 
double  the  risk  of  getting  prostate  cancer. 

1.1.4  PSA 

Prostate  specific  antigen  (PSA)  [PRO AC  2001]  is  an  enzyme  that  is  released  into  the 
bloodstream  by  both  normal  and  cancerous  prostate  cells.  Any  condition  that  could 
cause  injury  or  irritation  to  the  prostate  gland  such  as  infection  (prostatitis)  or 
noncancerous  enlargement  of  the  prostate  (BPH)  can  cause  elevation  of  PSA. 
However, the possibility of cancer is higher with an elevated PSA. A PSA level
above 4.0 ng/ml is usually considered elevated.
1.2 Introduction to Ultrasound


1.2.1  Introduction 

By convention, the limit of human hearing is normally taken to be 20 kHz (the unit Hz
means one cycle per second). Vibrations with frequencies above 20 kHz are said to be
ultrasonic.  In  fact,  medical  ultrasound  uses  frequencies  much  higher  than  this. 
Typically,  the  range  2-10  MHz  is  used  for  medical  ultrasonic  scanners. 

Ultrasound  waves  are  usually  elastic  compression  waves,  particularly  in  liquid  or 
semi-liquid  materials,  such  as  soft-tissue  organs.  As  the  ultrasound  wave  propagates 
through  the  target,  it  will  interact  differently  with  different  types  of  tissue  or  matter. 
The  interaction  depends  on  the  acoustic  properties  of  the  tissue,  such  as  the 
attenuation,  absorption  and  scattering,  impedance  and  velocity.  The  acoustic 
parameters  depend  strongly  on  the  frequency  of  the  ultrasound,  as  well  as  other 
parameters  such  as  temperature.  The  values  for  the  speed  of  ultrasound  waves  in 
different soft tissues are very similar. On modern scanners, it is assumed that the
value is 1540 m/s, which is a reasonable approximation in most cases [Lerski 1988].
Ultrasound does not penetrate hard tissue, such as bone, very well and as a
result, the scanning of bones is not routinely used in medical ultrasound.

1.2.2  Ultrasound  System 

There are numerous types of ultrasound systems. Conventional sonographic units
comprise a transducer, pulse generator, demodulator, amplifier, time gain
compensator, digital scan converter, memory storage, image processing, and a
display. More complex ultrasound systems may also have Doppler, color flow, or
other electronic features. Figure 2 is a simplified representation of an ultrasound unit.

The  following  is  a  simplified  explanation  of  the  ultrasound  system.  A  more  detailed 
description  of  medical  ultrasound  can  be  found  in  [NCSU  2001].  The  transducer  (a 
piezoelectric crystal) works to both send very high frequency sound waves and
receive  their  echoes  from  the  target.  When  sending  the  sound,  the  piezoelectric  crystal 
vibrates  in  response  to  voltage  changes  applied  to  it.  This  converts  electrical  energy 
into  mechanical  sound  wave  energy  and  introduces  motion  in  the  adjacent  medium 
(solid,  liquid,  or  gas).  These  sound  waves  are  transmitted  through  the  target  and 
reflected  at  some  of  the  boundaries.  A  reflected  sound  wave  will  travel  back  and 
strike  the  piezoelectric  crystal.  The  crystal  will  vibrate,  and  this  mechanical  stress  will 
cause  it  to  output  a  voltage  proportional  to  the  stress.  These  voltage  changes  are  then 
sent  to  amplifiers,  filters,  and  other  electronic  hardware  so  that  the  computer  display 
can  display  data  for  viewing. 


Figure  2:  Ultrasound  unit  [NCSU  2001] 


1.2.3  Medical  Ultrasound 

Ultrasound  has  proven  to  be  a  very  valuable  and  cost-effective  complementary 
medical  imaging  method,  along  with  CT  (Computed  Tomography)  and  MRI 
(Magnetic  Resonance  Imaging).  Ultrasound  tissue  characterization  techniques  are 
often  based  on  the  premise  that  disease  processes  alter  physical  characteristics  of 
tissue and that these alterations can cause observable changes in acoustic scattering
properties  [Lizzi  1986]. 

1.2.4  Ultrasound  Image 

To investigate quantitative sonographic characterization, it is important to convert the
radio frequency (RF) data into an image. The two-dimensional B-mode ultrasound
images, as shown in Figure 6, are formed by combining data derived from a series of
one-dimensional  A-line  scans.  Each  A-line  scan  is  the  backscattered  RF  signal 
recorded  at  a  single  transducer  location  resulting  from  the  transmitted  pressure  wave 
reflecting  from  scattering  sites  within  the  object  being  scanned. 

One A-line scan and its FFT result are shown in Figures 3 and 4. The ultrasound
transmit pulse has a 5 MHz center frequency.

Figure 3: The RF signal of a single A-line

Figure 4: FFT result of a single A-line

An image of the underlying tissue properties can be formed by combining the return
signals from a sequence of A-line scans at different lateral positions. To
create an image that is recognizable by the human eye, the data are often processed
to form a so-called B-mode image. To get the standard B-mode ultrasound
image, the envelope of the RF signal needs to be found. This is accomplished by
using the Hilbert transform [Mohanty 1987].

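The Hilbert transform is defined as

$$\hat{x}(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau \tag{1.1}$$

where the integral is taken as a Cauchy principal value.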
The Hilbert transform produces a sequence that is the input data with a 90° phase
shift (i.e., sines become cosines and vice versa). The details can be found in Appendix
E. Figure 5 shows how to realize the Hilbert transform on a computer with digital
data.


Figure 5: The implementation of Hilbert transform in computer

The Hilbert transform shifts the phase and is therefore called a quadrature filter. A
signal z(t) is called an analytic signal if

$$z(t) = x(t) + i\,\hat{x}(t) \tag{1.2}$$

where $\hat{x}(t)$ is the Hilbert transform of $x(t)$. If $x(t) = A\cos(\omega t + \phi)$, then
$\hat{x}(t) = A\sin(\omega t + \phi)$ and $z(t) = A e^{i(\omega t + \phi)}$.

The envelope of the RF signal can then be computed by taking the magnitude of the
complex sequence:

$$V = \sqrt{I^2 + Q^2} \tag{1.3}$$

where V is the envelope of the RF signal, I is the real input sequence (the
original data), and Q is the imaginary part (the actual Hilbert transform). The
imaginary part is a version of the original real sequence with a 90° phase shift.

This  method  was  used  to  compute  the  envelope  of  the  RF  signal  in  Figure  3.  The 
amplitude  of  the  envelope-detected  signal,  represented  by  brightness,  is  displayed  in  a 
standard B-mode ultrasound image (see Figure 6). It should be noted that although the B-mode
image is useful for certain applications, it does not contain all of the original
ultrasound information. Other signal processing and image representation modes are
also very useful in medical ultrasound.
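The envelope computation described above can be sketched in a few lines of Python. Figure 5 is not reproduced in this text, so the sketch does not claim to match the thesis's exact scheme. It uses one common digital approach: take the discrete Fourier transform, zero the negative-frequency bins, double the positive ones, and invert to obtain the analytic signal. The function names are hypothetical, and a naive DFT from the standard library is used only to keep the example self-contained. An FFT routine would be used in practice.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short A-lines)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return z = x + i*H[x] by suppressing the negative-frequency DFT bins."""
    n = len(x)
    spectrum = dft(x)
    weights = [0.0] * n
    weights[0] = 1.0                      # keep the DC bin unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin unchanged
        for k in range(1, n // 2):
            weights[k] = 2.0              # double the positive frequencies
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    return dft([s * w for s, w in zip(spectrum, weights)], inverse=True)


def envelope(x):
    """Envelope V = sqrt(I^2 + Q^2), with I = x and Q = Hilbert transform of x."""
    return [abs(z) for z in analytic_signal(x)]
```

For a pure tone such as `x[m] = A*cos(2*pi*8*m/64)`, the imaginary part of `analytic_signal(x)` is the corresponding sine and `envelope(x)` is constant at `A`, in line with the cosine example given for the analytic signal.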


Figure  6:  Ultrasound  image  of  prostate 
1.3 Goal


1.3.1  The  Clinical  Problem 

The  problem  of  finding  an  accurate  method  for  the  detection  and  staging  of 
adenocarcinoma  of  the  prostate  (ACP)  is  one  of  the  outstanding  challenges  in  the 
field of diagnostic medicine. The principal methods used to detect and confirm the
presence of ACP are digital rectal examination (DRE), the serum Prostate
Specific Antigen (PSA) assay, and transrectal ultrasound (TRUS) guided prostatic
biopsy.

Unfortunately,  current  individual  diagnostic  imaging  and  laboratory  tests  for  the 
detection  and  staging  of  ACP  perform  poorly.  The  PSA  test  now  plays  a  central  role 
in  screening  for  ACP  because  it  is  inexpensive  and  relatively  sensitive.  The  problem 
is  that  there  is  considerable  overlap  between  PSA  values  for  patients  with  cancer  and 
those  with  no  cancer,  especially  for  the  group  of  older  males  with  benign  prostatic 
hypertrophy  (BPH).  For  this  reason,  an  elevated  PSA  must  be  followed  by  a 
confirmatory  test.  The  DRE  and  TRUS  alone  have  proven  to  be  insufficiently 
sensitive  to  be  used  either  as  screening  studies  or  as  studies  that  can  reliably  guide  a 
prostate  biopsy  in  a  patient  with  an  elevated  PSA  [Garra  1998].  At  present,  most 
clinicians  do  not  consider  TRUS  imaging  to  be  adequate  for  detecting  suspicious 
regions.  It  is  considered  to  be  inadequate  because  its  sensitivity  is  not  sufficient  to 
reveal  the  presence  of  cancer  and  to  direct  a  biopsy  needle  to  the  cancer  site  [Feleppa 
1996].  One  ultimate  goal  of  this  research  would  be  to  create  a  system  that  can  reliably 
direct  a  biopsy  needle  to  sites  that  are  likely  to  be  cancerous. 

1.3.2  Recent  Research  on  Quantitative  Ultrasound 

Because  of  the  limitations  of  the  conventional  methods  in  this  field,  a  number  of 
investigators  are  using  quantitative  techniques  to  offer  improved  sensitivity  and 
specificity  for  prostate  cancer.  Feleppa  et  al.  [Feleppa  1996]  used  quantitative 
ultrasonic radio frequency (RF) spectral features and elementary discriminant
functions  to  discriminate  cancerous  from  noncancerous  prostate  tissue.  Feleppa  et  al. 
[Feleppa  1997]  reported  a  value  of  Az  of  0.79  ±  0.07  when  using  their  RF  features  to 
make  the  discrimination  between  cancerous  and  noncancerous  tissue,  and  a  value  of 
Az of 0.60 ± 0.21 from clinical interpretation of the conventional images. Improved
results using RF features were achieved in Feleppa’s more recent study
[Feleppa 2001]. The work of Feleppa et al. is limited to microscopic features.
Huynen  et  al.  [Huynen  1994]  investigated  the  discriminating  power  of  macroscopic 
features  from  ultrasonic  image  texture.  Ophir  and  colleagues  [Ophir  1997]  have 
recently  measured  the  elastogram  for  human  prostate  tissue  in  vitro.  The  elastogram 
is  a  display  of  tissue  hardness  deduced  from  the  local  tissue  strain  that  occurs  in 
response  to  an  externally  applied  static  compression.  It  has  been  shown  to  be  useful  in 
the  characterization  of  malignant  breast  tumors  that  are  harder  than  benign  lesions 
[Garra  1997],  and  is  expected  to  be  useful  in  prostate  cancer  detection  since  prostatic 
malignancies  are  also  often  “hard”  on  palpation. 

1.3.3  Goal 

The  history  of  developments  in  the  field  of  quantitative  analysis  of  ultrasound  and 
elastogram  features  strongly  suggests  that  investigators  turn  their  attention  to  an 
analysis  of  an  optimal  combination  of  the  microscopic  RF  features,  the  macroscopic 
image  texture  features,  and  measures  of  tissue  hardness  derived  from  the  elastogram. 

So  the  ultimate  goal  of  this  project  is  to  combine  features  derived  from  ultrasound 
(US)  images,  US  radio-frequency  (RF)  data,  tissue  elasticity  imaging,  and  clinical 
data  such  as  PSA  into  a  computerized  system  for  displaying  prostate  images  that 
indicate probable locations of cancer [Garra 2000]. In this thesis, only features derived
from US images and RF data are used to discriminate cancerous from noncancerous prostate
tissue.
1.4 Thesis Structure


This thesis is organized as follows. Chapter 2 introduces the features used in this
research, which come from ultrasound images and ultrasound radio-frequency data.
Chapter 3 outlines the evaluation methods, which include correlation, t-test,
Mahalanobis distance, discriminant analysis and performance evaluation. Chapter 4
develops a MATLAB-based tool with which we can draw a Region of Interest (ROI)
on a pathology image to get the corresponding data in the ultrasound radio-frequency
data sets and compute the corresponding US features for the chosen ROI. In Chapter
5, all the evaluation methods introduced in Chapter 3 are applied to the data
acquired via the feature computation software developed in Chapter 4. Finally,
Chapter 6 summarizes the conclusions reached in the thesis and suggests
future directions of research.
Chapter 2: Parameter Extraction Methods


US data can be processed in a wide variety of ways. The availability of modern
digital signal processing hardware and software opens up the possibility of an excess
of available signal processing techniques. A major goal of this project is to determine
the most suitable signal processing techniques for discriminating prostate cancer.
Following a series of discussions with experienced medical ultrasound investigators
[Garra 1993, Wagner 1983, Wear 1995], a set of eight features was chosen for further
investigation. This choice was based on the ability of these features
to detect cancer in other organs and on an understanding of the
underlying physical mechanisms. The chosen features were the basic texture
feature - envelope signal-to-noise ratio (SNR), four features coming from the
co-occurrence matrix and three features coming from spectral analysis of RF echo
signals. The following is a description of the signal processing algorithm associated
with each feature and the underlying rationale for its use.


2.1  Image  Statistics 

There  are  many  features  based  on  the  first-order  statistics.  Tissue  ultrasound  Signal  to 
Noise  Ratio  (SNR)  was  chosen  in  our  study.  This  may  be  useful  because  it  may  be 
another  way  to  measure  the  relative  contribution  of  specular  vs.  diffuse  tissue 
backscatter  components.  These  have  been  previously  shown  to  be  of  value  for 
characterization  of  both  liver  and  kidney  tissue  [Garra  1989,  Garra  1994]. 

2.1.1  Tissue  Ultrasound  Signal  to  Noise  Ratio 

$$\mathrm{SNR} = \frac{\mu}{\sigma} \tag{2.1}$$

with

$$\mu = \frac{1}{n}\sum_{x,y} f(x,y), \qquad \sigma^2 = \frac{1}{n}\sum_{x,y} \bigl(f(x,y) - \mu\bigr)^2$$

where μ is the mean gray level, an indicator of the average tone of the image; σ is the
standard deviation of gray level (the square root of the variance), an indicator of how much
variation exists in the image with respect to the average tone; n is the total number of
pixels; and f(x,y) is the gray level in pixel (x,y).
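As a minimal sketch of how the SNR of Equation (2.1) could be computed for one region of interest, the following Python function takes a flat list of envelope (gray-level) values. The function name and the choice of the 1/n (population) normalization for the variance are assumptions made for illustration, and the thesis software may differ.

```python
import math


def envelope_snr(pixels):
    """SNR = mean / standard deviation of the gray levels in a region.

    `pixels` is a flat sequence of envelope (gray-level) values from one ROI.
    The variance uses 1/n normalization, which is an assumption of this sketch.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("region contains no pixels")
    mu = sum(pixels) / n
    var = sum((p - mu) ** 2 for p in pixels) / n
    if var == 0:
        raise ValueError("region has zero variance; SNR is undefined")
    return mu / math.sqrt(var)
```

For example, the values `[2, 4, 4, 4, 5, 5, 7, 9]` have mean 5 and standard deviation 2, so `envelope_snr` returns 2.5.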

2.1.2  Rayleigh  Distribution 

The probability distribution of a narrow band noise process n(t) can be derived by
considering a complex phasor

$$n(t) = r(t)\exp\bigl(j\phi(t)\bigr) \tag{2.2}$$

where r(t) is the magnitude or envelope and φ(t) is the phase. This can also be
written in terms of its real and imaginary parts (in-phase and quadrature components)
as

$$n(t) = x(t) + j\,y(t)$$
$$x(t) = r(t)\cos\phi(t)$$
$$y(t) = r(t)\sin\phi(t)$$

If both random processes x(t) and y(t) are Gaussian distributed with the same
variance σ² and zero mean, then the probability density functions P(x) and P(y) are given
by

$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{x^2}{2\sigma^2}\right), \qquad
P(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{y^2}{2\sigma^2}\right)$$

Assuming x and y are statistically independent, then

$$P(x,y) = P(x)P(y) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right) \tag{2.3}$$

Transforming differential areas using

$$dx\,dy = r\,dr\,d\phi$$

gives the joint probability density function as

$$P(r,\phi) = \frac{r}{2\pi\sigma^2}\exp\left(-\frac{r^2}{2\sigma^2}\right) \tag{2.4}$$

Since this is independent of phase, the random variables r and φ are statistically
independent and therefore

$$P(r,\phi) = P(r)P(\phi)$$

Then

$$P(\phi) = \int_0^{\infty} P(r,\phi)\,dr = \frac{1}{2\pi} \quad \text{for } 0 \le \phi \le 2\pi \tag{2.5}$$

$$P(r) = \int_0^{2\pi} P(r,\phi)\,d\phi = \frac{r}{\sigma^2}\exp\left(-\frac{r^2}{2\sigma^2}\right) \quad \text{for } r \ge 0 \tag{2.6}$$


This  is  normally  called  the  Rayleigh  Distribution  [Bourke  2001].  The  Rayleigh 
distribution  is  a  special  case  of  the  Weibull  distribution  [Hahn  1967]. 
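The derivation above can be checked numerically. The following Python sketch draws the in-phase and quadrature components as independent zero-mean Gaussians with a common standard deviation, forms the envelope r = sqrt(x² + y²), and compares the result with the Rayleigh density of Equation (2.6), P(r) = (r/σ²) exp(−r²/(2σ²)). The function names, sample count and seed are arbitrary choices made for illustration.

```python
import math
import random


def rayleigh_envelope_samples(n, sigma=1.0, seed=0):
    """Envelope of a complex phasor whose real and imaginary parts are
    independent zero-mean Gaussians with standard deviation sigma."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n)]


def rayleigh_pdf(r, sigma=1.0):
    """Rayleigh density P(r) = (r / sigma^2) * exp(-r^2 / (2 sigma^2)), r >= 0."""
    if r < 0:
        return 0.0
    return (r / sigma ** 2) * math.exp(-r ** 2 / (2 * sigma ** 2))


def fraction_below(samples, threshold):
    """Empirical probability that the envelope does not exceed threshold."""
    return sum(1 for r in samples if r <= threshold) / len(samples)
```

Integrating the density from 0 to σ gives 1 − exp(−1/2), about 0.393, so `fraction_below(rayleigh_envelope_samples(20000), 1.0)` should land close to that value when the simulated envelope is Rayleigh distributed.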


2.1.3  Experimental  Result  for  Prostate  and  Phantom 

In  using  quantitative  techniques  to  analyze  the  backscattered  RF  signal,  it  is  helpful  to 
have  a  model  that  relates  physical  interaction  between  the  ultrasound  pulse  and  the 
scattering  medium  to  measured  quantities  derived  from  the  observed  signal.  The  most 
widely  accepted  model  of  ultrasound  scattering  in  soft  tissue  has  been  developed  by 
Wagner  et  al.  [Wagner  1986,  Wagner  1987].  In  this  model,  the  scatterers  are  divided 
into  three  classes.  The  major  two  classes  are  described  as  follows: 


The first class consists of a large number of randomly-located scatterers whose
structure is much smaller than the wavelength. When the number of scatterers per
resolution cell is sufficiently large and the phases of the complex phasors are
randomly distributed between 0 and 2π, the real and imaginary parts of the
resulting accumulated signal have a circular Gaussian joint pdf given by Equation
(2.3). The pdf of the magnitude signal that is displayed in B-mode images is given by
a Rayleigh pdf, Equation (2.6).

The  second  class  of  tissue  scatterers  is  nonrandomly  distributed  with  long-range 
order.  It  contributes  a  specular  backscattered  intensity.  When  these  scatterers  are 
present  along  with  the  random  diffuse  scatterers,  the  resulting  magnitude  signal  is  no 
longer  Rayleigh  distributed. 

From the above discussion, we know a histogram of the gray-scale pixel values for
the B-scan image with fully developed speckle will follow a Rayleigh probability
density function (pdf). The theoretical value of SNR for Rayleigh statistics is found to
be 1.91 [Wagner 1983]. For an image with specular as well as diffuse scatterers, the
value will be lower, in most instances.
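The value 1.91 follows from the first two moments of the Rayleigh density, as the short worked derivation below shows. The symbols $\mu_r$ and $\sigma_r$ for the mean and standard deviation of the envelope $r$ are introduced here only to keep them distinct from the Gaussian parameter $\sigma$. They are not the thesis's notation.

```latex
\mu_r = \int_0^{\infty} r \,\frac{r}{\sigma^2}
        \exp\!\left(-\frac{r^2}{2\sigma^2}\right) dr
      = \sigma\sqrt{\frac{\pi}{2}},
\qquad
E[r^2] = \int_0^{\infty} r^2 \,\frac{r}{\sigma^2}
        \exp\!\left(-\frac{r^2}{2\sigma^2}\right) dr
      = 2\sigma^2

\sigma_r^2 = E[r^2] - \mu_r^2 = \left(2 - \frac{\pi}{2}\right)\sigma^2

\mathrm{SNR} = \frac{\mu_r}{\sigma_r}
             = \sqrt{\frac{\pi/2}{2 - \pi/2}}
             = \sqrt{\frac{\pi}{4 - \pi}} \approx 1.91
```

The ratio does not depend on $\sigma$, which is why a single theoretical value applies to any region with fully developed speckle regardless of overall brightness.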

Figure 7 is the histogram of the phantom magnitude image shown in Figure 9. This
phantom has predominantly small scatterers and contains very few specular scatterers.
The probability distribution of gray level should therefore follow a Rayleigh distribution.
Figure 8 shows that it does follow a Rayleigh distribution. It should therefore exhibit
a μ/σ value very close to 1.91. Figure 9 demonstrates that sub-regions calculated
from within the phantom do in fact exhibit a μ/σ close to 1.91.

Figure 10 shows a histogram of a typical prostate magnitude image. This image is
shown in Figure 12 with the calculated signal to noise ratios for several regions of
interest in the image. Figure 11 is the probability distribution of this image, which
shows substantial non-Rayleigh behavior. This non-Rayleigh behavior is also
revealed by the envelope SNR: all the sub-regions have values less than 1.91,
which indicates that they contain specular as well as diffuse,
randomly positioned scatterers. This is an expected result since tissue rarely exhibits
purely diffuse scattering. Tests on other sections have yielded similar results.

Figure 7: Histogram of envelope values from phantom shown in Figure 9 (vertical axis: number of occurrences)

Figure 8: Probability distribution of phantom envelope

Figure 9: Signal to noise ratio values for sub-regions of a tissue mimicking calibration
phantom (1.902, 1.900, 1.870 and 1.898). Average value for the four regions is 1.893.

Figure 10: Histogram of envelope values from prostate shown in Figure 12

Figure 11: Probability distribution of prostate envelope

Figure 12: Signal to noise ratio values for sub-regions of a prostate section (case 18)

2.1.4 Signal-to-Noise Ratio Based on Split-spectrum

Following a suggestion from Dr. Keith A. Wear and Dr. Robert F. Wagner, we
split the spectrum and calculated the signal to noise ratio (SNR) for the high band and
low band separately. With the phantom data as input, the result for each band should
also be close to 1.91.


The SNR based on the split spectrum is calculated in four steps:

1) Compute the complex (real, imaginary) FFT of the trace of interest (from a given ROI).

2) Split the spectrum into two parts: a higher band and a lower band.

3) Take the IFFT of the complex data set from each band.

4) Detect the envelope of each signal and compute μ/σ as before.
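
The four steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used in this work: it uses a naive discrete Fourier transform so that it has no dependencies, splits the spectrum with smooth complementary raised-cosine masks instead of the Hamming-window filters described in the text that follows, and detects the envelope by keeping only the positive frequencies (the analytic signal). The function names and the synthetic test trace are hypothetical; the crossover edges reuse the cut-off values 0.1883 and 0.2283 quoted later in this section.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short traces)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    w = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    out = [sum(x[m] * w[(k * m) % n] for m in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def taper(f, f_lo, f_hi):
    """Raised-cosine ramp: 0 below f_lo, 1 above f_hi."""
    if f <= f_lo:
        return 0.0
    if f >= f_hi:
        return 1.0
    return 0.5 - 0.5 * math.cos(math.pi * (f - f_lo) / (f_hi - f_lo))

def split_spectrum_snr(trace, f_lo=0.1883, f_hi=0.2283):
    """Return (SNR_low, SNR_high); frequencies are fractions of Nyquist."""
    n = len(trace)
    spectrum = dft(trace)                                   # step 1
    snrs = []
    for band in ("low", "high"):
        analytic = [0j] * n
        for k in range(1, n // 2):                          # positive frequencies only
            hi = taper(k / (n / 2), f_lo, f_hi)
            gain = hi if band == "high" else 1.0 - hi       # step 2
            analytic[k] = 2.0 * gain * spectrum[k]
        env = [abs(v) for v in dft(analytic, inverse=True)]  # steps 3 and 4
        mean = sum(env) / n
        std = math.sqrt(sum((e - mean) ** 2 for e in env) / (n - 1))
        snrs.append(mean / std)
    return tuple(snrs)

rng = random.Random(0)
trace = [rng.gauss(0.0, 1.0) for _ in range(256)]   # synthetic noise, not RF data
print(split_spectrum_snr(trace))
```

With a single short trace the two estimates scatter noticeably around 1.91; averaging over all traces in an ROI, as is done for the measured data, reduces that scatter.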


The most important step is how to split the spectrum into two halves. As shown
before, the center frequency is 5 MHz and the sampling frequency is 48 MHz. The
following formula expresses the center frequency as a fraction of the Nyquist
frequency:

\[ f_n = \frac{f_c}{0.5 \times f_s} = \frac{5\,\mathrm{MHz}}{0.5 \times 48\,\mathrm{MHz}} = 0.2083 \tag{2.7} \]

where f_n is the center frequency normalized to the Nyquist frequency, f_c is the
center frequency, and f_s is the sampling frequency.

High-pass and low-pass filters are needed to split the spectrum. It might seem natural to
use the center frequency as the cut-off frequency of both filters, but an abrupt
truncation introduces “ringing” artifacts when the IFFT is taken. We therefore let the
window for the upper half and the window for the lower half overlap. The
Hamming window [MATLAB 2000] was used to design the high-pass and low-pass
filters, with a cut-off frequency of 0.1883 for the high-pass filter and 0.2283 for the
low-pass filter. The coefficients of a Hamming window are computed from the
following equation:

\[ w[k+1] = 0.54 - 0.46 \cos\left( 2\pi \frac{k}{n-1} \right), \quad k = 0, \ldots, n-1 \tag{2.8} \]

Figure 13 shows the 17-point Hamming window used in this research.
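
As a small worked example of Equation (2.8), the 17 coefficients can be generated directly. The function name is an assumption for this sketch; the index k runs from 0 to n − 1, so the window is symmetric and peaks at its center sample.

```python
import math

def hamming(n):
    """Hamming coefficients 0.54 - 0.46 cos(2 pi k / (n - 1)) for k = 0..n-1."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]

w = hamming(17)
print([round(v, 3) for v in w[:3]], round(w[8], 3))  # [0.08, 0.115, 0.215] 1.0
```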


Figure 13: 17-point Hamming window
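
One common way to turn a Hamming window into a filter is the windowed-sinc method, sketched below. This is an assumption about the design procedure, which the text does not spell out: an ideal low-pass impulse response is multiplied by the 17-point window and normalized to unit gain at zero frequency, and the high-pass filter is obtained by spectral inversion. A library routine may scale the coefficients slightly differently. All function names are hypothetical.

```python
import cmath
import math

def hamming(n):
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def lowpass_fir(cutoff, n_taps=17):
    """Windowed-sinc low-pass; cutoff is a fraction of the Nyquist frequency."""
    mid = (n_taps - 1) / 2.0
    h = [cutoff * sinc(cutoff * (k - mid)) * w
         for k, w in enumerate(hamming(n_taps))]
    dc_gain = sum(h)
    return [v / dc_gain for v in h]           # unit gain at zero frequency

def highpass_fir(cutoff, n_taps=17):
    """Spectral inversion of the low-pass design (n_taps must be odd)."""
    h = [-v for v in lowpass_fir(cutoff, n_taps)]
    h[(n_taps - 1) // 2] += 1.0
    return h

def gain(h, f):
    """Magnitude response at normalized frequency f (1.0 = Nyquist)."""
    return abs(sum(v * cmath.exp(-1j * math.pi * f * k) for k, v in enumerate(h)))

f_n = 5.0e6 / (0.5 * 48.0e6)      # Equation (2.7)
low = lowpass_fir(0.2283)         # cut-off slightly above f_n
high = highpass_fir(0.1883)       # cut-off slightly below f_n
print(round(f_n, 4))              # 0.2083
print(gain(low, f_n), gain(high, f_n))
```

Because the two cut-offs straddle the normalized center frequency, both filters pass part of the region around f_n, which is the overlap that avoids an abrupt truncation.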

Figure 14 shows the frequency response of the high-pass filter. Figure 15 shows the
frequency response of the low-pass filter.

Figure 14: Frequency response of high-pass filter (horizontal axis: normalized frequency, ×π rad/sample)

Figure 15: Frequency response of low-pass filter

When we split the spectrum, the bandwidth of each half becomes smaller
than the full bandwidth. Since the size of the resolution cell is inversely
proportional to the bandwidth, the resolution cell for the split cases is larger than
the original. All else being equal, the more scatterers per resolution cell, the closer
the SNR will be to 1.91, so the split-band SNR should have an even better chance of
being close to 1.91 than the original. We randomly chose five ROIs in
one phantom image and calculated the SNR of the lower band, the higher band, and the
full band (the normal SNR). The results are shown in Table 1. The average SNR
is 1.899 for the lower band data set, 1.909 for the higher band data set,
and 2.102 for the normal case. The SNR values generated from the
lower band and higher band data sets are therefore closer to 1.91 than the SNR value
generated in the normal way.


           SNRL     SNRH     SNRN
ROI 1      1.910    1.876    2.132
ROI 2      1.957    1.965    2.052
ROI 3      1.846    1.823    2.119
ROI 4      1.905    1.923    2.079
ROI 5      1.879    1.956    2.127
Average    1.899    1.909    2.102

Table 1: Signal to noise ratio values for sub-regions of a tissue mimicking calibration
phantom. L denotes the low band, H the high band, and N the normal (full-band) case.




2.2  Image  Texture 




There are several paradigms for measuring texture mathematically. A commonly used
one is based on the gray level co-occurrence matrix (GLCM), also known in the
literature as the spatial gray level dependence matrix (SGLDM). It has been categorized
as an efficient approach to texture analysis [Frederick 2000, Garra 1993]. Using this
approach, representative texture features can be measured to characterize the properties
of each region of interest (ROI). Features based on the co-occurrence matrix have
already been demonstrated to be valuable for prostatic cancer [Huynen 1994].

2.2.1  Co-occurrence  Matrix 

The first step in computing the co-occurrence matrix is to demodulate the radio
frequency (RF) signal to produce an envelope-detected image. This is accomplished
by using the Hilbert transform; details can be found in Chapter 1. The Hilbert
transform shifts the phase of the input data by 90°. The envelope of the RF signal is
then computed as the magnitude of the complex signal formed from the original and the
90° phase-shifted time sequences.

The next step is to compute the co-occurrence matrix, C. This is an N×N matrix,
where N is the number of gray levels. Each element C_ij of C takes on a
value that is the number of times a pixel has the value i and its “neighbor” pixel has
the value j. The neighbor pixel is defined as lying at a radial distance of d pixels
at angle θ. The values in the matrix are then normalized to represent the probabilities of
specific gray level combinations. Different co-occurrence matrices can be constructed
by changing the direction and distance between pixel pairs when defining spatial
relationships. The co-occurrence matrix for distance d, angle θ, and N possible
gray level values is given by Equation (2.9).

\[ \mathrm{GLCM} = \begin{bmatrix} p_{d,\theta}(0,0) & p_{d,\theta}(0,1) & \cdots & p_{d,\theta}(0,N-1) \\ p_{d,\theta}(1,0) & p_{d,\theta}(1,1) & \cdots & p_{d,\theta}(1,N-1) \\ \vdots & \vdots & \ddots & \vdots \\ p_{d,\theta}(N-1,0) & p_{d,\theta}(N-1,1) & \cdots & p_{d,\theta}(N-1,N-1) \end{bmatrix} \tag{2.9} \]


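An example is given as follows:

\[ I = \begin{bmatrix} 0 & 0 & 0 & 1 & 2 \\ 1 & 1 & 0 & 1 & 1 \\ 2 & 2 & 1 & 0 & 0 \\ 1 & 1 & 0 & 2 & 0 \\ 0 & 0 & 1 & 0 & 1 \end{bmatrix} \tag{2.10} \]

This is the matrix of a 5×5 image with only 3 gray levels. The corresponding
co-occurrence matrix (d = 1, θ = 45°) is shown in Equation (2.11).

\[ C = \begin{bmatrix} 4 & 2 & 0 \\ 2 & 3 & 2 \\ 1 & 2 & 0 \end{bmatrix} \quad (d = 1,\ \theta = 45^\circ) \tag{2.11} \]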
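
The counting rule can be sketched in a few lines. The image below is the 5×5 example of Equation (2.10). The neighbor offset of one row down and one column to the right is an assumption about the angle convention, chosen because it reproduces the counts in Equation (2.11); the function names are hypothetical.

```python
def glcm(image, d_row, d_col, levels):
    """Count pairs (i, j): a pixel of value i whose neighbor, offset by
    (d_row, d_col), has value j."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            rn, cn = r + d_row, c + d_col
            if 0 <= rn < rows and 0 <= cn < cols:   # skip pairs leaving the image
                counts[image[r][c]][image[rn][cn]] += 1
    return counts

def normalize(counts):
    """Convert counts to joint probabilities p(i, j)."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

image = [
    [0, 0, 0, 1, 2],
    [1, 1, 0, 1, 1],
    [2, 2, 1, 0, 0],
    [1, 1, 0, 2, 0],
    [0, 0, 1, 0, 1],
]
C = glcm(image, 1, 1, 3)
print(C)  # [[4, 2, 0], [2, 3, 2], [1, 2, 0]]
```

The 16 counted pairs correspond to the 4×4 block of pixels that have a diagonal neighbor inside the image.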


Alternatively  the  neighbor  pixel  can  be  defined  as  being  dx  pixels  away  in  the  vertical 
direction  and  dy  pixels  away  in  the  lateral  direction.  The  actual  values  of  dx  and  dy 
are  5  in  this  research.  Methods  for  choosing  an  effective  neighbor  distance  can  be 
found  in  [Mia  1999]. 

2.2.2  Features  Based  On  Co-occurrence  Matrix 

Once the co-occurrence matrix is calculated, parameters that might be useful in
distinguishing tissue types can be extracted.

Angular Second Moment (ASM) or Energy:
\[ \mathrm{ASM} = \sum_i \sum_j p(i,j)^2 \]
An indicator of uniformity or smoothness. Homogeneous textures will have a higher
value than inhomogeneous ones because smooth textures have more concentrated
densities than rough textures. Rough textures have densities with higher spread or
variance.

Contrast (CON) or Difference Moment:
\[ \mathrm{CON} = \sum_i \sum_j (i - j)^2 \, p(i,j) \]
An indicator of gray level variance and therefore smoothness.

Correlation (COR):
\[ \mathrm{COR} = \sum_i \sum_j \frac{(i - \mu_x)(j - \mu_y) \, p(i,j)}{\sigma_x \sigma_y} \]
An indicator of underlying structure in a texture. The absolute value of this
measure will be large if the image has some sort of structure such as a smooth
background or repeated sharp edges over a given region.

Entropy (ENT):
\[ \mathrm{ENT} = -\sum_i \sum_j p(i,j) \log p(i,j) \]
An indicator of the amount of information provided by pairwise interactions of image
pixels separated by a distance d.

Inverse Difference Moment (IDM):
\[ \mathrm{IDM} = \sum_i \sum_j \frac{p(i,j)}{1 + (i - j)^2} \]
Emphasizes small changes and subtle textures.

Dissimilarity (DIS):
\[ \mathrm{DIS} = \sum_i \sum_j |i - j| \, p(i,j) \]
Measures the degree of dissimilarity between pixels.

Table 2: Haralick’s texture parameters

The joint probability of gray levels i and j in the direction θ at a distance d is p(i,j).
N is the number of gray levels in the digitized image, μ_x and μ_y are the means of the
row and column sums, respectively, and σ_x and σ_y are the standard deviations of the
row and column sums, respectively.

Many texture measures can be calculated from the co-occurrence matrix. One set of
such texture parameters is known as Haralick’s texture measures [Frederick 2000].
Some of them are summarized in Table 2. ASM, COR, CON and ENT are chosen
from the six features listed above.
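
The sketch below evaluates the six measures of Table 2 from a normalized co-occurrence matrix, using the commonly published forms of Haralick's definitions. Two details are assumptions for this example: μ and σ are taken as the mean and standard deviation of the gray-level index under the row-sum and column-sum distributions, and the entropy uses the natural logarithm. The function name is hypothetical, and the input is the example matrix of Equation (2.11) divided by its total of 16.

```python
import math

def haralick_features(p):
    """Texture measures from a normalized co-occurrence matrix p (sums to 1)."""
    idx = range(len(p))
    px = [sum(p[i][j] for j in idx) for i in idx]      # row sums
    py = [sum(p[i][j] for i in idx) for j in idx]      # column sums
    mu_x = sum(i * px[i] for i in idx)
    mu_y = sum(j * py[j] for j in idx)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * px[i] for i in idx))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * py[j] for j in idx))
    cells = [(i, j, p[i][j]) for i in idx for j in idx]
    return {
        "ASM": sum(v * v for _, _, v in cells),
        "CON": sum((i - j) ** 2 * v for i, j, v in cells),
        "COR": sum((i - mu_x) * (j - mu_y) * v for i, j, v in cells) / (sd_x * sd_y),
        "ENT": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "IDM": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "DIS": sum(abs(i - j) * v for i, j, v in cells),
    }

counts = [[4, 2, 0], [2, 3, 2], [1, 2, 0]]
p = [[v / 16 for v in row] for row in counts]
features = haralick_features(p)
print(features["CON"], features["DIS"])  # 0.75 0.625
```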


2.3  Spectral  Analysis  of  Radio  Frequency  (RF) 

Backscatter is the reflection of the pressure wave from the scattering sites back in the
direction of the transducer that transmitted the pulse. The ultrasonic backscatter
coefficient is a useful parameter that describes the scattering efficiency of a tissue or
material as a function of ultrasonic frequency. The backscatter coefficient of the tissue
as a function of frequency can be used to generate three useful features:
slope, intercept and mid-band value. These features have been successfully used by
Feleppa [Feleppa 1996] to differentiate prostatic cancer from benign tissue.
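
As an illustration of how the three features relate to one another, the sketch below fits a least-squares line to a spectrum expressed in decibels and reads off the slope, the intercept (the fit extrapolated to zero frequency), and the mid-band value (the fit evaluated at the center frequency). The function name and the synthetic, exactly linear spectrum are assumptions for this example; the 3.0 to 7.0 MHz range and the 5 MHz center frequency are the values used in this study.

```python
def spectral_features(freqs_mhz, spectrum_db, f_center_mhz):
    """Least-squares line fit; returns (slope, intercept, mid-band value)."""
    n = len(freqs_mhz)
    mean_f = sum(freqs_mhz) / n
    mean_s = sum(spectrum_db) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs_mhz)
    sxy = sum((f - mean_f) * (s - mean_s)
              for f, s in zip(freqs_mhz, spectrum_db))
    slope = sxy / sxx                        # dB per MHz
    intercept = mean_s - slope * mean_f      # dB, fit extrapolated to 0 MHz
    midband = intercept + slope * f_center_mhz
    return slope, intercept, midband

freqs = [3.0 + 0.5 * k for k in range(9)]        # 3.0 to 7.0 MHz
spectrum = [-60.0 + 2.0 * f for f in freqs]      # synthetic values, not measured data
print(spectral_features(freqs, spectrum, 5.0))
```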

A  reference  phantom  method  for  measuring  ultrasonic  backscatter  coefficients  was 
used  in  this  study.  With  the  reference  phantom  method,  spectra  obtained  in  tissues  are 
compared  with  spectra  obtained  from  a  phantom  with  known  scattering  and 
attenuation  properties.  The  data  were  compensated  for  attenuation  due  to  intervening 
tissues  between  the  transducer  and  the  ROI. 

The formula [Wear 1995] used to compute the tissue backscatter coefficient as a
function of frequency, η_t(f), is

\[ \eta_t(f) = \eta_p(f) \, \frac{S_t(f)}{S_p(f)} \, \exp\left( 4 \sum_{i=1}^{n} \alpha_i(f) z_i - 4 \alpha_p(f) z \right) \tag{2.12} \]

where η_t(f) is the tissue backscatter coefficient as a function of frequency, η_p(f) is the
backscatter coefficient of the reference phantom, S_t(f) and S_p(f) are the average power
spectra measured from tissue and phantom, respectively, α_p(f) is the reference phantom
attenuation coefficient, and z is the distance from the transducer to the center of the
ROI. In vivo measurements are presumed to involve n tissue layers with attenuation
coefficients α_i(f) and thicknesses z_i.

The reference phantom was a tissue-like slurry containing glass beads and graphite in
agar particles suspended in a water-alcohol solution. The attenuation coefficient was
0.57 dB/MHz-cm. The phantom used to test the backscatter coefficient measurement
method consisted of glass beads embedded in agar. Figure 16(b) is the phantom
image.

Figure 16: B-mode images of data used for calculating the backscatter coefficient: (a) RF image, (b) phantom image, (c) ROI on RF

To simplify the model, the tissue attenuation coefficient α_i(f) was assumed to be
constant. A value of 0.5 dB/MHz-cm was selected because it is commonly considered
to be representative of moderately vascular tissues [Feleppa 1996].

The  RF  feature  is  an  ROI-based  spectral  analysis.  In  order  to  facilitate  the  ROI 
selection,  an  interactive  user-based  tool  was  developed. 

Figure 16(a) shows one ultrasound image used in this test, Figure 16(b) shows the
corresponding phantom image, and Figure 16(c) shows one ROI inside the ultrasound
image chosen by the user.

2.3.1  Average  Power  Spectra  Measured  from  Tissue  and  Phantom 

After specifying the ROI on a B-mode image, the RF data samples along each scan-line
segment within the ROI were multiplied by a Hamming window. The windowed
data were subjected to a fast Fourier transform (FFT), and the squared magnitude of
the computed spectrum was derived. Spectral results for all scan-line segments within
the ROI were averaged to form an estimate of the average power spectrum. The
average power spectra measured from tissue, S_t(f), and phantom, S_p(f), are shown in
Figure 17 and Figure 18.




2.3.2  Backscatter  Coefficient  of  Reference  Phantom 

The backscatter coefficient of the reference phantom, η_p(f), is a known quantity, which is
shown in Figure 19 and listed in Appendix D.




Figure  19:  Backscatter  coefficient  of  reference  phantom 

2.3.3  Tissue  Backscatter  Coefficient  as  a  Function  of  Frequency 
The tissue backscatter coefficient as a function of frequency, η_t(f), was calculated
using Equation 2.12 and is shown in Figure 20. The spectrum was then converted to
decibels, as shown in Figure 21. Linear regression analysis was applied to the
decibel-format curve to compute the intercept, slope and mid-band value (the value of
the fit at the center frequency). The analysis was performed over the
frequency range of 3.0 to 7.0 MHz (Figure 22).

Chapter 3: Evaluation Methods


This  chapter  contains  a  review  of  several  data  analysis  techniques  such  as  correlation 
analysis,  t-test,  Mahalanobis  distance,  discriminant  analysis  and  performance 
analysis,  which  will  be  used  in  Chapter  5  to  analyse  the  parameters  described  in 
Chapter  2. 


3.1  Correlation  Analysis 

Statistical  correlation  refers  to  a  quantifiable  relationship  between  two  variables. 
Furthermore,  it  is  a  measure  of  the  strength  and  direction  of  that  relationship.  The 
strength  and  direction  of  a  correlation  are  indicated  by  the  correlation  coefficient. 
Computing  the  correlation  coefficients  provides  an  efficient  method  to  identify  all 
redundant  features  within  a  group.  A  redundant  feature  provides  little  or  no  new 
information  to  aid  in  the  task  of  distinguishing  between  samples  from  the  two  classes 
[Mia  1999]. 

3.1.1  Correlation  Coefficient 

The correlation coefficient (r) is a number between -1 and 1 that measures the degree
to which two variables (X and Y) are linearly related. If there is a perfect linear
relationship with positive slope between the two variables, the correlation coefficient
is 1. There is positive correlation (r > 0) when cases with large values of X also tend
to have large values of Y whereas cases with small values of X tend to have small
values of Y. If there is a perfect linear relationship with negative slope between the
two variables, the correlation coefficient is -1. There is negative correlation (r < 0)
when cases with large values of X tend to have small values of Y and vice versa. A
correlation coefficient of 0 means that there is no linear relationship between the
variables. Correlation coefficients give no information about cause and effect.
Similarly, they provide misleading information if the relationship between X and Y is
non-linear.

Scatter plots are useful tools to interpret the correlation coefficient. Different types of
correlations are shown in Figure 23.

Figure 23: Scatter plots of different types of correlations: (a) positive correlation, (b)
negative correlation, (c) no correlation, (d) moderate positive correlation

There are a number of techniques for measuring correlation coefficients. The most
popular are Pearson’s Product Moment and Spearman’s Rank.




3.1.2  Pearson’s  Product  Moment  Correlation  Coefficient 
Pearson's  product  moment  correlation  coefficient,  usually  denoted  by  r,  is  one 
example  of  a  correlation  coefficient.  It  is  a  measure  of  the  linear  association  between 
two  variables  that  have  been  measured  on  interval  or  ratio  scales,  such  as  the 
relationship between height in inches and weight in pounds. However, it can be
misleadingly small when there is a relationship between the variables but it is a
non-linear one [Minitab 1997].

We can use the following formula to compute Pearson's r. The correlation between
two variables x and y is defined as the covariance of x with y divided by the product
of the standard deviation of x and the standard deviation of y:

\[ r_{xy} = \frac{\mathrm{cov}_{xy}}{s_x s_y} \quad \text{with} \quad \mathrm{cov}_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1} \tag{3.1} \]

where cov_xy is the covariance between the two variables x and y, x̄ and s_x are the
sample mean and standard deviation for the first sample, and ȳ and s_y are the sample
mean and standard deviation for the second sample.
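
A direct transcription of Equation (3.1) is sketched below, together with a helper that applies it to every pair of feature columns, which is how redundant features can be spotted. The function names and the small data sets are assumptions made for this example.

```python
import math

def pearson_r(x, y):
    """Pearson's r: sample covariance divided by the two standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    return cov / (sx * sy)

def correlation_matrix(features):
    """features: list of equally long columns, one per feature."""
    return [[pearson_r(a, b) for b in features] for a in features]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]      # monotonic but non-linear in x
print(round(pearson_r(x, y), 4))     # 0.9811
```

The non-linear example gives a coefficient below 1 even though y is completely determined by x, which illustrates why r should be read only as a measure of linear association.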

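If there are multiple features, we can build the correlation matrix with Equation (3.2).
Computing the correlation matrix provides an efficient method to identify the
redundant features within a group.

\[ R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{bmatrix}, \quad |r_{ij}| \le 1.0, \quad i, j = 1, \ldots, n \]

with

\[ r_{ij} = \frac{\mathrm{cov}_{ij}}{\sigma_i \sigma_j}, \quad r_{ij} = r_{ji}, \quad r_{ii} = 1, \quad \mathrm{cov}_{ij} = \frac{1}{N-1} \sum_{k=1}^{N} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j) \tag{3.2} \]

where σ_i and σ_j are the standard deviations of the i-th and j-th features, x_ki is the
value of the i-th feature in the k-th sample, N is the total
number of samples, and n is the number of features. The correlation matrix is
symmetric about the main diagonal.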


The strength of a correlation is indicated by its magnitude (absolute value). For
example, -0.9 is just as strong as 0.9; only the direction differs. Table 3 shows the
characterizations of Pearson r.

Data Range      Correlation
.90 to 1        very high
.70 to .89      high
.50 to .69      moderate
.30 to .49      low
.00 to .29      little if any

Table 3: Characterizations of Pearson r

In summary, the Pearson product moment correlation coefficient can be used to measure the
degree of linear relationship between two variables. The correlation coefficient
assumes a value between -1 and +1. If one variable tends to increase as the other
decreases, the correlation coefficient is negative. Conversely, if the two variables tend
to increase together, the correlation coefficient is positive [Minitab 1997].

3.1.3  Limitations  of  the  Correlation  Tests 

Correlation  does  not  imply  causality.  A  significant  correlation  does  not  necessarily 
mean  cause  and  effect.  It  should  be  noted  that  Pearson  r  computations  are  sensitive  to 
extreme  values  in  the  data. 


3.2  t-test 

The  t-test  assesses  whether  the  means  of  two  groups  are  statistically  different  from 
each  other.  The  t-test  will  be  used  to  help  find  the  most  promising  features  among  the 
features  introduced  in  Chapter  2  [Minitab  1997]. 

For example, in Figure 24, there are three different possible outcomes, labeled
medium, high and low variability. Notice that the difference between the means is
exactly the same in all three situations. The only thing that differs between them is
the variability or “spread” of the scores around the means. A small difference
between means will be hard to detect if there is a lot of variability or noise. A large
difference between means will be easily detectable if variability or noise is low. This
way of looking at differences between groups is directly related to the signal-to-noise
metaphor: differences are more apparent when the signal is high and the noise is low
[Trochim 2001].

Figure 24: Difference between means [Trochim 2001]

There are five factors that contribute to whether the difference between the means of
two groups can be considered significant: the difference between the means of the
two groups, the degree of overlap between the groups, the number of subjects in the
two groups, the alpha level used to test the mean difference (how confident we must be
that there is a mean difference), and whether a directional (one-tailed) or
non-directional (two-tailed) hypothesis is being tested [Minitab 1997].

There are three types of t-test: the paired t-test (correlated t-test), the equal variance
t-test (pooled variance t-test) and the unequal variance t-test (separate variance t-test).
In this study, it is necessary to test the difference between independent sample means
with unequal variance. The test statistic t is calculated by Equation (3.2):

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s} \quad \text{with} \quad s = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \tag{3.2} \]

where x̄_1 and x̄_2 are the sample means, s_1 and s_2 are the sample standard deviations,
and n_1 and n_2 are the sizes of the two groups.

A p-value below 0.05 is generally considered statistically significant, while one of
0.05 or greater indicates no statistically significant difference between the groups.
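
The unequal variance statistic can be sketched as follows. The degrees of freedom use the Welch-Satterthwaite approximation, which is the usual companion of this test but is not stated in the text above, so it should be read as an assumption; converting t to a p-value additionally requires the t distribution from a statistics package. The function name and sample values are hypothetical.

```python
import math

def welch_t(sample1, sample2):
    """Unequal-variance t statistic and approximate degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    v1 = sum((x - m1) ** 2 for x in sample1) / (n1 - 1)   # sample variances
    v2 = sum((x - m2) ** 2 for x in sample2) / (n2 - 1)
    se = math.sqrt(v1 / n1 + v2 / n2)
    t = (m1 - m2) / se
    dof = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, dof

t, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(round(t, 4), round(dof, 3))  # -1.8974 5.882
```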


3.3  Mahalanobis  Distance 

An  alternative  distance  measure  that  is  in  common  use  is  the  Mahalanobis  distance. 
This  is  similar  to  the  Bayesian  distance  in  that  it  takes  into  account  the  shape  of  the 
covariance  matrix  of  the  class  model.  However,  the  derivation  of  the  Mahalanobis 
distance  formula  assumes  that  the  covariance  matrices  of  each  class  are  the  same  in 
order  to  simplify  the  calculations  involved.  Thus  it  is  valid  to  use  the  Mahalanobis 
distance  measure  if  the  data  for  each  class  are  similarly  distributed.  However,  there  is 
nothing  to  prevent  its  use  if  they  are  not.  The  Mahalanobis  distance  is  defined  as: 

\[ d_i^2(x) = (x - m_i)^T S_p^{-1} (x - m_i) \tag{3.2} \]

where d_i^2(x) is the Mahalanobis distance (also called the squared distance) of
observation x to the center (mean) of group i, m_i is the mean value of group i, and
S_p^{-1} is the inverse of the variance-covariance matrix of X. A column in X is
represented by x. Notice that

\[ d_i^2(m_j) = d_j^2(m_i) \tag{3.3} \]

This is the Mahalanobis distance between groups i and j.

3.4 Discriminant Analysis


Discriminant analysis is used to classify observations into two or more groups.
Discriminant analysis can also be used to investigate how variables contribute to
group separation [Minitab 1997].

There  are  two  types  of  discriminant  analysis  -  linear  and  quadratic  discriminant 
analysis.  With  linear  discriminant  analysis,  all  groups  are  assumed  to  have  the  same 
covariance  matrix.  Quadratic  discrimination  does  not  make  this  assumption  but  its 
properties  are  not  as  well  understood.  The  linear  discriminant  analysis  with  cross- 
validation  and  prior  probabilities  is  used  in  this  study. 

3.4.1  Linear  Discriminant  Analysis 

An  observation  is  classified  into  a  group  if  the  Mahalanobis  distance  of  observation  to 
the  group  center  (mean)  is  the  minimum.  Linear  discriminant  analysis  has  the 
property  of  a  symmetric  Mahalanobis  distance. 
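
The classification rule can be illustrated with two features and two groups. The sketch below pools the within-group covariance, which reflects the equal-covariance assumption of linear discriminant analysis, and assigns an observation to the group with the smallest squared Mahalanobis distance. It is restricted to two features so that the matrix inverse can be written out by hand; the function names and the sample values are invented for this example.

```python
def mean_vec(rows):
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]

def pooled_cov_2d(groups):
    """Pooled 2x2 variance-covariance matrix of several groups of 2-D samples."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for rows in groups:
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in s]

def inv_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def mahalanobis_sq(x, mean, s_inv):
    d = [x[0] - mean[0], x[1] - mean[1]]
    return sum(d[a] * s_inv[a][b] * d[b] for a in range(2) for b in range(2))

def classify(x, groups):
    """Return (index of nearest group, list of squared distances)."""
    s_inv = inv_2x2(pooled_cov_2d(groups))
    dists = [mahalanobis_sq(x, mean_vec(g), s_inv) for g in groups]
    return dists.index(min(dists)), dists

group_a = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]   # made-up samples
group_b = [(5.0, 6.0), (6.0, 5.0), (6.0, 7.0), (7.0, 6.0)]
print(classify((2.5, 2.5), [group_a, group_b]))
```

Because a single pooled matrix is used for every group, the distance from the mean of group i to group j equals the distance from the mean of group j to group i, which is the symmetry property mentioned above.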

3.4.2  Cross-Validation 

Cross-validation is a technique used to compensate for an optimistic apparent error
rate. The apparent error rate is the percentage of misclassified observations. The
cross-validation routine works by omitting each observation one at a time, recalculating
the classification function using the remaining data, and then classifying the omitted
observation.

3.4.3  Prior  Probabilities 

If the prior probabilities are known or can be estimated, discriminant
analysis can use them in calculating the posterior probabilities, that is, the
probabilities of assigning observations to groups. With the assumption that the data
have a normal distribution, the linear discriminant function is increased by ln(p_i),
where p_i is the prior probability of group i. Because observations are assigned to
groups according to the smallest generalized distance, the effect is to increase
somewhat the posterior probabilities for a group with a high prior probability.


3.5  Performance  Analysis 

An approach that provides information about the overall performance of a diagnostic
system is known as Receiver Operating Characteristic (ROC) analysis. Such curves
were  first  applied  to  assess  how  well  radar  equipment  in  World  War  II  distinguished 
random  interference  (noise)  from  signals  truly  indicative  of  enemy  planes. 

When a diagnosis is made, there are usually two types of errors [Mia 1999]: Type 1
errors occur when samples from Class 1 are assigned to Class 2, and Type 2 errors occur
when samples from Class 2 are assigned to Class 1. If the normal cases are called Class 1
and the abnormal cases are called Class 2, then the Type 1 error rate is called the False
Positive Fraction (FPF) and the Type 2 error rate is called the False Negative Fraction
(FNF). Similarly, the percentage of correctly assigned samples in Class 1 is called the
True Negative Fraction (TNF) and the percentage of correctly assigned samples in
Class 2 is called the True Positive Fraction (TPF). See Figure 25.
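
The four fractions can be computed from a list of true classes and a list of assigned classes, as in the sketch below. The function name and the nine example cases are assumptions for illustration; Class 1 denotes normal and Class 2 abnormal, as in the text.

```python
def error_fractions(truth, decision):
    """truth and decision hold 1 (normal, Class 1) or 2 (abnormal, Class 2)."""
    n1 = sum(1 for t in truth if t == 1)
    n2 = sum(1 for t in truth if t == 2)
    # Type 1 error: a Class 1 sample assigned to Class 2 (false positive).
    fp = sum(1 for t, d in zip(truth, decision) if t == 1 and d == 2)
    # Type 2 error: a Class 2 sample assigned to Class 1 (false negative).
    fn = sum(1 for t, d in zip(truth, decision) if t == 2 and d == 1)
    fpf, fnf = fp / n1, fn / n2
    return {"FPF": fpf, "FNF": fnf, "TNF": 1.0 - fpf, "TPF": 1.0 - fnf}

truth    = [1, 1, 1, 1, 2, 2, 2, 2, 2]
decision = [1, 1, 1, 2, 2, 2, 2, 1, 1]
print(error_fractions(truth, decision))
# {'FPF': 0.25, 'FNF': 0.4, 'TNF': 0.75, 'TPF': 0.6}
```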

Usually, in ROC analysis, the performance of a diagnostic system is described by the
indices of “sensitivity” and “specificity”, where “sensitivity” is the
True Positive Fraction (TPF) and “specificity” is the True Negative Fraction (TNF)
of a diagnosis. In a complementary way, the FNF and the FPF can be defined as FNF
= 1 - TPF and FPF = 1 - TNF, respectively. Due to this dependence, it is only necessary
to measure one pair of indices. Frequently TPF and FPF are used.

Figure 25: The binormal model for ROC analysis [Frederick 2000]

Figure 26a indicates 10 different decision thresholds, L1, L2, ..., L10. For
each of these decision thresholds the FP and TP percentages have been computed.
These 10 combinations (FP, TP) have been plotted as curve a in Figure 27. This
graphical representation is called the ROC curve, which plots the TPF as a function of
the FPF.
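
The construction of an ROC curve from a set of decision thresholds can be sketched as follows. The area is computed with the trapezoidal rule on the empirical points, which is a simpler stand-in for a fitted binormal curve and will generally give a slightly different number. The function names, scores, and thresholds are invented for this example, and a case is called positive when its score exceeds the threshold.

```python
def roc_points(normal_scores, abnormal_scores, thresholds):
    """(FPF, TPF) for each threshold; positive means score > threshold."""
    pts = []
    for t in thresholds:
        fpf = sum(s > t for s in normal_scores) / len(normal_scores)
        tpf = sum(s > t for s in abnormal_scores) / len(abnormal_scores)
        pts.append((fpf, tpf))
    return pts

def area_under_curve(points):
    """Trapezoidal area under the empirical ROC, with (0,0) and (1,1) added."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

normal = [1.0, 2.0, 3.0, 4.0]        # hypothetical feature values, Class 1
abnormal = [3.0, 4.0, 5.0, 6.0]      # hypothetical feature values, Class 2
thresholds = [0.5 + k for k in range(7)]
points = roc_points(normal, abnormal, thresholds)
print(area_under_curve(points))  # 0.875
```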

To show the effect of more or less overlap of the two distributions, in curves b and c
of Figure 27, the same two hypothetical distributions are used but they are shifted
closer to each other and farther apart, respectively. The effect of more overlapping
distributions is seen in ROC curve b and the effect of less overlap is seen in curve c of
Figure 27. The less the histograms overlap, the more closely the ROC approaches the ideal
point of (FP, TP) = (0, 100). As curves bow more to the left, they indicate greater
accuracy (a higher ratio of true positives to false positives). The more the histograms
overlap, however, the more the ROC approaches the diagonal line that runs from the
point (FP, TP) = (0, 0) to the point (FP, TP) = (100, 100). This straight line would
signify that the diagnostic test had 50/50 odds of making a correct diagnosis (no
better than flipping a coin).

The above example shows that the performance of any decision model is primarily
determined by the discriminatory power of the features. If the features show too much
overlap, a different decision threshold does not help. This not only applies to models
that operate on one feature but also applies to models that use several features at a
time. Therefore, the principal task in developing a decision model with an optimal
performance is finding the most discriminating features.

Figure 26: Two normal distributions with 10 different decision thresholds L1, L2, ..., L10 [ROC 2001]

Figure 27: ROC curve [ROC 2001]


The  accuracy  is  indexed  more  precisely  by  the  amount  of  area  under  the  curve,  which 
increases  as  the  curves  bend  to  the  ideal  point  of  (FP,  TP)  =  (0,100).  A  rough  guide 
for  classifying  the  accuracy  of  a  diagnostic  test  is  the  traditional  academic  point 
system: 


Az Range        Accuracy of Classification
.90 - 1         Excellent
.80 - .89       Good
.70 - .79       Fair
.60 - .69       Poor
.50 - .59       Fail

Table 4: Characterizations of Az

If the accuracy is acceptable, we can select a threshold for yes/no diagnoses. The goal
is to choose a threshold that yields a good rate of true positives without generating an
unacceptable rate of false positives. Each point on the curve represents a specific
threshold, moving from the strictest at the bottom left to the most lenient at the top
right. Strict thresholds limit false positives at the cost of missing many cases of cancer;
lenient thresholds maximize discovery of the cases of cancer at the cost of many false
positives.
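
The relationship between decision thresholds, ROC points, and the area under the curve can be sketched as follows. This is only an illustrative empirical (trapezoidal) calculation, not the curve-fitting approach used by ROCKIT; the function names and the feature values are hypothetical, and it assumes that larger feature values indicate cancer.

```python
def roc_points(cancer_scores, benign_scores):
    """Return (FP%, TP%) points obtained by sweeping the decision threshold."""
    thresholds = sorted(set(cancer_scores) | set(benign_scores), reverse=True)
    points = [(0.0, 0.0)]  # strictest threshold: nothing is called cancer
    for t in thresholds:
        tp = 100.0 * sum(s >= t for s in cancer_scores) / len(cancer_scores)
        fp = 100.0 * sum(s >= t for s in benign_scores) / len(benign_scores)
        points.append((fp, tp))
    return points  # the last point is (100, 100), the most lenient threshold


def area_under_curve(points):
    """Trapezoidal area under the ROC curve, rescaled to the range 0..1."""
    area = 0.0
    for (fp0, tp0), (fp1, tp1) in zip(points, points[1:]):
        area += (fp1 - fp0) * (tp0 + tp1) / 2.0
    return area / (100.0 * 100.0)


# Hypothetical feature values, for illustration only.
cancer = [0.9, 0.8, 0.7, 0.55, 0.4]
benign = [0.6, 0.5, 0.3, 0.2, 0.1]
curve = roc_points(cancer, benign)
az = area_under_curve(curve)
```

The less the two groups of values overlap, the closer the computed area comes to 1; fully overlapping groups give a value near 0.5.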

One  popular  software  package  for  performing  ROC  curve-fitting  and  statistical 
analysis  is  the  ROCKIT  software  developed  by  a  research  group  led  by  Charles  Metz 
at  the  University  of  Chicago,  which  uses  the  LABROC4  algorithm  [Metz  1998]. 

An example ROC curve drawn using ROCKIT is shown in Figure 28. Using only a single
feature, entropy, the area under the ROC curve is 0.7663 ± 0.0659.

Figure 28: A sample Receiver


Chapter 4 System Implementation

This chapter describes the procedures for acquiring ultrasound radio frequency (RF)
data, for controlling the saturation ratio of the RF data, and for composing a pathology
image. Finally, a detailed description of the development and usage of the software
is given.

4.1 RF Data Acquisition


Using  the  linear  array  transducer,  in  vitro  ultrasound  data  are  acquired  from  fresh 
radical  prostatectomy  specimens  that  are  obtained  immediately  after  resection  at 
Fletcher  Allen  Health  Care.  The  whole  prostate  gland  is  transported  from  the 
operating  suite  to  the  ultrasound  instruments.  The  specimen  is  immediately  immersed 
in  sterile  isotonic  saline  solution  in  a  tank  with  sound  absorbing  walls.  The  prostate  is 
oriented  so  that  the  plane  of  the  base  of  the  gland  is  vertical  and  the  posterior 
(peripheral  zone)  is  at  the  top  of  the  tank.  The  gland  is  then  scanned  using  a  Diasonics 
Spectra  real-time  scanner  with  a  5  MHz  center-frequency  scanhead.  The  scan  planes 
are parallel to the base of the gland and are taken in the transverse plane with a
clamped  transducer.  The  scan  planes  are  taken  at  2  mm  intervals.  The  ultrasonic  RF 
signal  is  digitized  (8  bits)  at  48  MHz  using  a  LeCroy  digitizer.  Figure  29  shows  the 
experimental  scan  setup.  Figure  30  is  the  enlarged  image  of  the  pad  and  the  glue  that 
provides  a  tight  contact  and  an  air-free  seal  between  the  transducer  and  the  specimen. 
The air-free seal is necessary for the transducer to send the high frequency sound into
the object unimpeded.

After the prostate RF data acquisition is finished, RF data from a special
phantom, whose frequency-dependent attenuation and backscatter properties are
known, are collected using the same machine settings. These data are used for
calibration of the prostate RF data.

Figure 29: In vitro experimental scan setup


Figure  30:  Pad  and  glue 


4.2  Saturated-ratio  Control  Software 


One problem must be considered when the ultrasound radio frequency data are
acquired. Because the LeCroy digitizer samples the data with 8 bits, the recorded
values range from -128 to 127. If the signal amplitude falls outside of this range, the
data are “saturated” and some important information is lost. The saturated-ratio
is defined as the number of samples outside the range of -128 to 127 divided by the
total number of samples. The saturated-ratio of “good” data should be close
to 0%, which means that few samples are outside the range of -128 to 127.
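
A minimal sketch of the saturated-ratio calculation is given below. It assumes that a sample whose true amplitude exceeds the 8-bit range is recorded at the limit value (-128 or 127), so saturated samples are counted as those sitting at either limit; the function name and the sample values are hypothetical and are not taken from the actual tool.

```python
def saturated_ratio(samples, low=-128, high=127):
    """Fraction of samples sitting at (or beyond) the limits of the 8-bit range."""
    if not samples:
        raise ValueError("no samples")
    clipped = sum(1 for s in samples if s <= low or s >= high)
    return clipped / len(samples)


# Hypothetical RF line: four of the ten samples are clipped.
rf_line = [3, 127, 127, -40, -128, 15, 90, 127, -5, 0]
ratio = saturated_ratio(rf_line)
```

A ratio near zero suggests that the gain settings keep the signal inside the digitizer range.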

An example of saturated RF data appears in Figure 31. The upper image shows the
ultrasound image of the radio frequency data. The lower image shows one line in the
radio frequency data set. The saturated-ratio of this example is 10.31%, which means
more than 10% of the samples are outside the range of -128 to 127. Many peaks are
clearly cut off and some important information is lost. Thus, the system needs to be
adjusted.

It is difficult to determine whether the RF data are saturated by merely examining
the B-mode images. To minimize saturation problems, a software tool has been
developed. The software helps the operator adjust the machine settings so that the
saturated-ratio is reduced and the data fall into the range of -128 to 127. Once the
saturated-ratio is close to 0%, the settings of the machine are fixed. The result
after adjustment is shown in Figure 32.

Figure 31: Saturated control - before adjustment


4.3 PATH Image Composite


After  ultrasound  data  acquisition,  the  prostatectomy  specimen  is  fixed  in  buffered 
formalin.  This  process  stiffens  the  tissue  so  that  less  deformation  occurs  during 
sectioning.  Next  the  specimen  is  sent  to  a  surgical  pathologist  for  examination.  The 
gland  is  sliced  into  multiple  transverse  sections  with  the  plane  of  the  sections  being 
perpendicular  to  the  posterior  surface  of  the  gland.  This  is  simply  achieved  by  placing 
the  posterior  surface  down  on  the  cutting  table  and  cutting  downward  vertically. 
Ideally,  these  slices  would  correspond  to  the  ultrasound  image  slices. 

After the transverse sections are made in pathology, each section is further divided
into  quarters  so  that  the  tissue  will  fit  on  a  standard  microscope  slide.  The  pathologist 
examines  the  section  quarters  and  all  foci  of  cancer  are  marked  on  the  glass  slide  with 
indelible  ink.  Then  the  slides  (quarter  sections)  are  digitized  for  reassembly  into 
complete  sections  (also  known  as  “whole  mount”  sections). 

The glass slides are arranged on the tray of a transparency flatbed scanner and
scanned at a resolution of 200 - 400 dpi. The digitized images are then placed
into Adobe Photoshop, and the images are “warped” slightly to fit better with each
other. Warping is necessary since some shrinkage and distortion occur during the
sectioning and fixation process. Figure 33 shows the scanning result of four quarter
sections with cancer marked. Figure 34 is the result of assembling the four quarter
sections into a whole slice.

Since  the  cancer  area  was  marked  with  indelible  ink  on  the  pathology  image  by  the 
pathologist,  this  information  will  be  used  to  find  the  location  of  the  cancer  area  in  the 
ultrasound  image  and  in  the  ultrasound  RF  data  set  as  well.  The  details  can  be  found 
in 4.4.3.

Figure 33: Pathology quarter sections with CA marked


Figure  34:  Quarter  sections  assembled  into  whole  slice 


4.4  User  Interface  Design 


Feature computation software has been developed for the in vitro prostate tissue
classification project. This MATLAB based software has a Windows graphical user
interface (GUI). Figure 35 is a screen shot of the GUI. When the user chooses an
ultrasound image (left image), the software automatically selects the corresponding
pathology image (right image). The user then draws an ROI on the pathology image
and the software maps it into the ultrasound RF data set. All of the features introduced
in Chapter 2 are calculated for the ROI, and the result is stored in a database. This
section describes how the database is built, how the PATH image for a US image is
loaded automatically, how the ROI is located between the US and PATH images, and
how the software is used.

Figure 35: Graphic user interface

4.4.1 Building Database

In  the  development  of  the  software,  according  to  specification,  it  was  necessary  to 
store  a  large  amount  of  different  types  of  information.  It  was  decided  to  use  a 
database  to  fulfill  this  requirement.  Microsoft  Access  2000  was  chosen  to  store  the 
information.  Software  was  developed  to  accomplish  the  data  processing  tasks  by 
using  the  database  tools  provided  by  MATLAB. 

The first development step was to decide the structure of the database. Three kinds
of information need to be stored. First, it is necessary to record some
information about each case, such as the ultrasound filename of the case, the number
of slices in the case, the corresponding pathology filename and number, and the
corresponding phantom filename. Second, when this tool is used to locate the region
of interest (ROI) inside the radio frequency data, the location and character of each
ROI need to be recorded. Third, it is necessary to store the features for each ROI.
Three tables are therefore needed in the database. The first (patient) stores the
information for each case, the second (roiinf) saves the ROI information, and the third
(feature) saves the features for each ROI. The structure of patient and roiinf is listed
in Appendix A.
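
The three-table layout can be sketched as follows. This is only an illustration: it uses an in-memory SQLite database as a stand-in for the Microsoft Access database, and every column name, file name, and value is hypothetical (the actual table structures are the ones listed in Appendix A).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Access database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    case_id          INTEGER PRIMARY KEY,
    us_filename      TEXT,     -- ultrasound RF file of the case
    us_slices        INTEGER,  -- number of ultrasound slices
    path_filename    TEXT,     -- pathology image file
    path_slices      INTEGER,  -- number of pathology slices
    phantom_filename TEXT      -- calibration phantom file
);
CREATE TABLE roiinf (
    roi_id   INTEGER PRIMARY KEY,
    case_id  INTEGER REFERENCES patient(case_id),
    us_slice INTEGER,          -- slice that contains the ROI
    x0       INTEGER,          -- hypothetical ROI corner and size
    y0       INTEGER,
    width    INTEGER,
    height   INTEGER,
    tissue   TEXT              -- character of the ROI, e.g. 'cancer'
);
CREATE TABLE feature (
    roi_id INTEGER REFERENCES roiinf(roi_id),
    name   TEXT,
    value  REAL
);
""")

# Hypothetical example records.
conn.execute(
    "INSERT INTO patient VALUES (1, 'case01.rf', 15, 'case01.tif', 12, 'phantom01.rf')"
)
conn.execute("INSERT INTO roiinf VALUES (1, 1, 3, 40, 60, 12, 50, 'cancer')")
conn.execute("INSERT INTO feature VALUES (1, 'ENT', -5.4)")

# Join the three tables to list each stored feature with its case and ROI label.
rows = conn.execute("""
    SELECT p.us_filename, r.tissue, f.name, f.value
    FROM feature f
    JOIN roiinf r ON r.roi_id = f.roi_id
    JOIN patient p ON p.case_id = r.case_id
""").fetchall()
```

Keeping the features in their own table, keyed by ROI, lets new features be added without changing the case or ROI records.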

After building the structure of the database, the connection to MS Access 2000 had to
be constructed. This can be done via the Open Database Connectivity (ODBC) Data
Source Administrator, which is in the Control Panel (the operating system is Windows
2000). ODBC is a widely accepted application programming interface (API) for
database access. It is based on the Call-Level Interface (CLI) specifications from
X/Open and ISO/IEC for database APIs and uses Structured Query Language (SQL)
as its database access language.

Once the database and the connection to the database are built, the database is ready
to be used in the program and to provide the services needed.

4.4.2 Loading PATH Image for US Image Automatically

When the data are processed, the ultrasound file for a specific slice of one case is
loaded, and the corresponding pathology image also needs to be loaded. Most of the
time this relation is not a simple one-to-one mapping: the number of ultrasound slices
usually differs from the number of pathology images, and both numbers vary with
the size of the gland. Rather than repeating the calculation by hand every time the
data are processed, this information is stored in the database, in the table called
patient, whose structure can be found in Table 14. With this information, when an
ultrasound file is loaded, the corresponding pathology image can be calculated using
the following equation:

$$ j = \operatorname{round}\left( i \times \frac{n_p}{n_u} \right) \qquad (4\text{-}1) $$

where $i$ and $j$ are the positions of the slice in the ultrasound and pathology series,
and $n_p$ and $n_u$ are the numbers of pathology and ultrasound slices, respectively.

For example, suppose 15 slices were taken using ultrasound and only 12 were taken in
pathology. Once the third ultrasound slice is chosen, the above equation gives
round(3 x 12/15) = round(2.4) = 2, so the second slice in the pathology images will be
loaded. The spacing between slices is 2 mm for ultrasound and 15/12 x 2 mm = 2.5 mm
for pathology. The third ultrasound slice is at a position within the gland of
3 x 2 = 6 mm. The second pathology slice is at a position of 2 x 2.5 = 5 mm within
the gland.
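
The slice lookup can be sketched in a few lines. The function name is hypothetical; the sketch takes n_u as the number of ultrasound slices and n_p as the number of pathology slices, which is the reading consistent with the worked example above, and it assumes that halves are rounded upward.

```python
import math


def pathology_slice(i, n_u, n_p):
    """Equation (4-1): pathology slice j matching ultrasound slice i."""
    # floor(x + 0.5) rounds halves upward for positive values; Python's
    # built-in round() would round halves to the nearest even integer.
    return math.floor(i * n_p / n_u + 0.5)


# Worked example from the text: 15 ultrasound slices, 12 pathology slices.
j = pathology_slice(3, n_u=15, n_p=12)  # 3 * 12 / 15 = 2.4, so j = 2
```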

Since the position of the slice in pathology and ultrasound is not the same, this
introduces uncertainty into the exact pathology slice with which the ultrasound data
should be correlated. The solution to this problem is to only use cancer foci that are
large enough to appear on several pathology slices. This reduces the possibility that
ultrasound data coming from normal prostate tissue are mistakenly used as cancer in
the data processing.

4.4.3  Locating  the  ROI  Between  the  US  and  the  PATH  Image 

When an ROI is specified in the pathology image, the corresponding ultrasound data
need to be found. The features for that ROI are then calculated. The following
describes how the ROI is mapped between the ultrasound and pathology images.

First, the boundary of the prostate in the pathology image needs to be specified. The
original size of the pathology image is 600 x 600, as shown in the upper right of
Figure 36. When the pathology image is loaded into the GUI, it is resized to 256
x 400 (the size of the image axes) (upper middle). The user then uses the mouse to
draw a box (shown in long dash dot) inside the image that encloses the boundary of
the gland, which is represented by the red ellipse. The boundary of the prostate is
then resized to 256 x 400 (upper left).

Second, it is necessary to specify the boundary of the prostate in the ultrasound image.
The original size of the ultrasound image matrix is 256 x 320 (lower right). If the
gland is larger than the transducer, the boundary of the gland falls outside of the
image, so the tool must allow the user to draw a box outside of the original image
region when indicating the boundary of the prostate. To make this possible, the size
of the image matrix was increased to 308 x 384 by adding zeroes around the original
image matrix (lower middle). The image is then resized to 256 x 400. With the help
of a radiologist, the boundary of the gland in the ultrasound image is enclosed by a
box drawn in long dash dot. The part of the gland boundary that lies outside of the
original image is marked by red dashes.
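
The zero-padding step can be sketched as follows. This is a minimal illustration that assumes the matrix has 256 rows and 320 columns and that the padding is split evenly on all sides (26 rows above and below, 32 columns left and right); the function name is hypothetical and the actual software may distribute the padding differently.

```python
def pad_with_zeros(image, out_rows, out_cols):
    """Center `image` (a list of rows) inside a zero matrix of the given size."""
    rows, cols = len(image), len(image[0])
    top = (out_rows - rows) // 2
    left = (out_cols - cols) // 2
    padded = [[0] * out_cols for _ in range(out_rows)]
    for r in range(rows):
        # Overwrite the central part of the zero row with the original data.
        padded[top + r][left:left + cols] = image[r]
    return padded, top, left


# Hypothetical 256 x 320 image filled with ones, padded to 308 x 384.
image = [[1] * 320 for _ in range(256)]
padded, top, left = pad_with_zeros(image, 308, 384)
```

The returned offsets (top, left) are what a box drawn on the padded image would need in order to be converted back to coordinates in the original matrix.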

Since  the  boundaries  of  gland  are  marked  on  both  the  ultrasound  and  the  pathology 
image, the next step is to draw the ROI on the pathology image and according to the

4.4.4 GUI Usage

Before using the software to compute the tissue features, it is necessary to build the
ODBC connection to the Microsoft Access 2000 database and enter the needed
information into the database, as described in 4.4.1.

First,  one  ultrasound  slice  file  and  the  corresponding  pathology  image  are  loaded.  The 
software  automatically  turns  the  RF  data  into  an  ultrasound  image  and  loads  the  right 
pathology image according to the information in the database. Note that the prostate
is scanned and cut in opposite directions, so the pathology image
needs to be rotated 180 degrees to match the orientation of the ultrasound image.

Second,  find  the  boundary  of  the  prostate  on  both  images,  map  the  region  of  interest 
to  the  ultrasound  data  set,  and  save  all  the  position  information  in  the  database. 

Then  with  the  help  of  the  position  information,  all  of  the  features  are  calculated  for 
each  ROI.  The  results  are  saved  in  the  database. 

The flowchart of the usage of the GUI is shown in Appendix B.


Chapter 5: Results


The  parameters  described  in  Chapter  2  were  extracted  from  the  RF  ultrasound  signals 
by  using  the  tool  introduced  in  Chapter  4.  The  evaluation  methods  described  in 
Chapter  3  were  used  to  analyze  those  parameters.  The  results  of  that  analysis  are 
presented  in  this  chapter.  First,  the  case  material  and  features  are  summarized.  Then, 
the  results  of  feature  analysis  are  reported. 


5.1  Case  Material 


Seventy-eight radical prostatectomy specimens have been studied so far. Unfortunately,
not all of the data from the specimens are useful. Many of the samples have only
microscopic (1-2 mm) cancer. Some of the data were unusable due to technical
problems during acquisition. The current preliminary analysis includes:


Categories      Number of ROIs
Cancer          36 (from 12 cancer patients)
Benign          19 (from benign regions)

Table 5: Case materials


5.2  Features  Calculated 

The parameters listed in Table 6 were extracted from all of the ROIs listed in Table 5.
These features include the raw RF features, which reflect microscopic
information on scatterer size and acoustic concentration that is not visually accessible
in images, and the texture features from the co-occurrence matrix, which carry
information on macroscopic tissue architecture. The details of each feature were
introduced in Chapter 2.


Groups                          Features
RF                              Backscatter vs. Frequency Slope
                                Backscatter Zero Frequency Intercept
                                Backscatter Mid Band Value
Image Statistics                Signal to Noise Ratio
Image Texture (Co-occurrence)   Angular Second Moment
                                Entropy
                                Contrast
                                Correlation

Table 6: List of features


5.3  Feature  Analysis 

First, the ability of each feature to separate the two groups was assessed by using
Student’s t-test. Then, to identify the combination of features that most efficiently
separated normal from cancer with the lowest error rates, a stepwise discriminant
analysis was employed.

5.3.1 t-test

The t-test assesses whether the means of two groups are statistically different from
each other. Table 7 provides the mean and standard deviation (s.d.) of
the features for the two groups listed in Table 5. By convention, p-values below 0.05
are considered statistically significant. The p-value of ENT is the smallest, at 0.00016,
far less than 0.05. This indicates that the mean of ENT is significantly different
between benign and cancer regions, which is a positive sign that ENT is one of the
best features for separating the two groups.


Feature     CA Mean (± s.d.)        Benign Mean (± s.d.)    t-test p value
Slope       0.778 ± .348 dB/MHz     .588 ± .326             0.05224
Intercept   -11.81 ± 2.19 dB        -10.41 ± 2.20           0.03160
Mid Band    -7.88 ± 1.48 dB         -7.42 ± 1.57            0.29304
SNR         1.62 ± .38              1.36 ± .35              0.01583
ASM         .0099 ± .018            .0029 ± .0026           0.03089
ENT         -5.41 ± 1.04            -6.42 ± .78             0.00016
CON         4111 ± 3872             2066 ± 1049             0.00482
COR         -.7469 ± .115           -.8043 ± .0788          0.03420

Table 7: Results for the mean, standard deviation and t-test
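
As an illustration of how a t statistic follows from such summary values, the sketch below computes the unequal-variance (Welch) form of the test for the ENT row, reading that row as -5.41 ± 1.04 for the 36 cancer ROIs and -6.42 ± 0.78 for the 19 benign ROIs. Which variant of the t-test was actually used is not stated here, so the choice of the Welch form and the function name are assumptions.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and approximate degrees of freedom from summary data."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# ENT: cancer (n = 36) versus benign (n = 19).
t_ent, df_ent = welch_t(-5.41, 1.04, 36, -6.42, 0.78, 19)
```

Under these assumptions the statistic is roughly 4 with about 47 degrees of freedom, which corresponds to a very small p-value; converting the statistic to a p-value requires the t distribution and is left out of the sketch.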

The results are also plotted as a bar chart with error bars in Figure 37. The height of
a bar represents the mean value for that group. The error bar shows the 95%
confidence limits for each mean. ENT is clearly the most significant feature.

Figure 37: Bar chart of each feature (groups: Cancer, Benign)


5.3.2  Correlation 

A correlation matrix, as described in Chapter 3, was computed for the set of eight
features. The correlation matrix is symmetric about the main diagonal, so the upper
right portion of the matrix, which is not shown, is a mirror image of the lower left
portion. A positive coefficient indicates that the values of variable A vary in the
same direction as those of variable B. A negative coefficient indicates that the values
of variable A and variable B vary in opposite directions.


            SNR       Slope     Intercept ENT       ASM       CON       COR       Midband
SNR         --
Slope       .032      --
            .816
Intercept   -.064     -.767     --
            .641      .000
ENT         .377      .088      -.281     --
            .005      .522      .037
ASM         .349      .074      -.109     .716      --
            .009      .592      .430      .000
CON         .595      .069      -.141     .639      .852      --
            .000      .069      .306      .000      .000
COR         .776      .045      -.155     .284      .094      .498      --
            .000      .746      .258      .035      .493      .000
Midband     -.138     -.025     .647      -.350     -.103     -.183     -.242     --
            .316      .858      .000      .009      .453      .180      .075

Cell Contents: Pearson correlation (upper value)
               P-Value (lower value)


Table 8: Correlation matrix of the eight candidate features

From Chapter 3 we know that a correlation coefficient with a magnitude between 0.70
and 0.89 indicates a high correlation. Using a magnitude of 0.70 as the threshold for
redundant features, four pairs of features meet that criterion. From the table above,
the ENT and ASM features have a correlation coefficient of 0.716; the slope
and intercept features have a correlation coefficient of -0.767; the CON and ASM
features have a correlation coefficient of 0.852; and the COR and SNR features have a
correlation coefficient of 0.776. Some features can therefore be eliminated without
significant loss of information. The decision of which feature(s) to eliminate can
be aided by computing the Mahalanobis distance for each feature.
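
The screening for redundant pairs can be sketched as a simple scan of the lower triangle of the correlation matrix. The coefficients below are the Pearson values read from Table 8; the function name and data layout are illustrative assumptions.

```python
names = ["SNR", "Slope", "Intercept", "ENT", "ASM", "CON", "COR", "Midband"]

# Lower triangle of the correlation matrix: row k holds the coefficients
# between feature k and features 0 .. k-1.
lower = [
    [],
    [0.032],
    [-0.064, -0.767],
    [0.377, 0.088, -0.281],
    [0.349, 0.074, -0.109, 0.716],
    [0.595, 0.069, -0.141, 0.639, 0.852],
    [0.776, 0.045, -0.155, 0.284, 0.094, 0.498],
    [-0.138, -0.025, 0.647, -0.350, -0.103, -0.183, -0.242],
]


def redundant_pairs(names, lower, threshold=0.70):
    """Return (feature, feature, r) for every pair with |r| >= threshold."""
    pairs = []
    for i, row in enumerate(lower):
        for j, r in enumerate(row):
            if abs(r) >= threshold:
                pairs.append((names[i], names[j], r))
    return pairs


pairs = redundant_pairs(names, lower)
```

The absolute value is used so that a strong negative correlation, such as the one between slope and intercept, is also flagged.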

5.3.3  Mahalanobis  Distance 

The Mahalanobis distance is a measure of the separation between the means of a
feature computed for the two classes. A low value does not necessarily mean that a
feature provides no separation between the two classes (separation may still be
provided by a quadratic or other more complex classifier), but a high value is a
good indication that the feature will provide good separation. The Mahalanobis
distance, as described in Chapter 3, is presented in Table 9 for each of the eight
features.
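
For a single feature, one common form of this measure is the squared difference of the class means divided by the pooled variance. The sketch below uses that form as an assumption (the exact definition is the one given in Chapter 3) together with the ENT summary statistics, read as -5.41 ± 1.04 for 36 cancer ROIs and -6.42 ± 0.78 for 19 benign ROIs; the function name is hypothetical.

```python
def mahalanobis_1d(mean1, sd1, n1, mean2, sd2, n2):
    """Squared distance between two class means, scaled by the pooled variance."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) ** 2 / pooled_var


# ENT: cancer (n = 36) versus benign (n = 19).
d_ent = mahalanobis_1d(-5.41, 1.04, 36, -6.42, 0.78, 19)
```

Under these assumptions the result is about 1.11, close to but not identical with the tabulated ENT value of 1.1430; the difference may come from rounding of the summary statistics or from a different variance estimate.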


Feature     Mahalanobis Distance
SNR         0.4855
Intercept   0.4028
Slope       0.3097
Mid-band    0.0949
ENT         1.1430
COR         0.3036
CON         0.4073
ASM         0.2173

Table 9: Mahalanobis distance of each of the eight features

Of the four pairs of features identified as having a high correlation, the ENT feature
has the largest Mahalanobis distance. The feature ASM, highly correlated with ENT,
is eliminated from the analysis. The other features are retained for further analysis.
This leaves seven features from which to select feature combinations that perform
well.

The Mahalanobis distance for each of the two-feature combinations is given in
Table 10. Notice that only the pairs including feature ENT have a value larger than or
equal to 1.143. The Mahalanobis distances of all the other two-feature
combinations are less than 1.143, which means their classification performance
is no better than that of the single feature ENT. Since the Mahalanobis distances of
ENT vs. midband and ENT vs. CON are almost the same as 1.143, and ENT and ASM
are highly correlated, the most promising two-feature combinations are ENT
vs. SNR, ENT vs. intercept, ENT vs. slope and ENT vs. COR.

We added one more feature to the above promising pairs to see whether any improvement
was introduced. The Mahalanobis distances of those combinations are shown in Table 11.
The largest Mahalanobis distance value is generated by the combination of ENT,
slope and SNR, which is 1.700.

            SNR       Intercept Slope     Midband   ENT       COR       CON       ASM
SNR         --
Intercept   0.918     --
Slope       0.840     0.418*    --
Midband     0.544     0.419     0.410     --
ENT         1.319     1.352     1.498     1.144     --
COR         0.487*    0.651     0.628     0.342     1.266     --
CON         0.577     0.765     0.723     0.454     1.143     .492      --
ASM         0.558     0.594     0.518     0.493     ?         ?         .426*     --

* indicates the two items have a high correlation
? indicates an entry that is illegible in the source


Table 10: Mahalanobis distance of two features combination


                SNR       Intercept Slope     Midband   COR       CON       ASM
ENT&SNR         --
ENT&Intercept   1.564     --
ENT&Slope       1.700     1.498*    --
ENT&Midband     1.321     1.498     1.500     --
ENT&COR         1.323*    1.459     1.629     1.275     --
ENT&CON         1.377     1.353     1.499     1.144     1.292     --

Table 11: Mahalanobis distance of three features combination


5.4 Classification Results


Linear discriminant analysis with cross validation was used to classify
observations into two groups. The analysis used the “leave one out” method
[Lachenbruch, 1968] and a linear Bayes classifier to generate 2 x 2 contingency
tables at multiple arbitrary decision threshold levels for the cancer vs. benign
cases. These results were used to compute sensitivities and specificities and to
produce an ROC curve with the help of the ROCKIT software. The classification
performance was measured by the area under the ROC curve (Az).
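
The leave-one-out procedure can be sketched for a single feature as follows. This is a simplified illustration, not the actual analysis: it assumes equal prior probabilities for the two classes, a pooled variance, and hypothetical feature values, and all function names are made up for the example.

```python
def fit_lda_1d(cancer, benign):
    """Linear discriminant for one feature with pooled variance, equal priors."""
    n1, n0 = len(cancer), len(benign)
    m1, m0 = sum(cancer) / n1, sum(benign) / n0
    ss = sum((x - m1) ** 2 for x in cancer) + sum((x - m0) ** 2 for x in benign)
    pooled_var = ss / (n1 + n0 - 2)
    weight = (m1 - m0) / pooled_var
    bias = -weight * (m1 + m0) / 2.0
    return weight, bias  # score = weight * x + bias; positive favors cancer


def leave_one_out_scores(cancer, benign):
    """Score each sample with a classifier trained on all the other samples."""
    labeled = [(x, 1) for x in cancer] + [(x, 0) for x in benign]
    scores = []
    for k, (x, label) in enumerate(labeled):
        rest = labeled[:k] + labeled[k + 1:]
        weight, bias = fit_lda_1d(
            [v for v, lab in rest if lab == 1],
            [v for v, lab in rest if lab == 0],
        )
        scores.append((weight * x + bias, label))
    return scores


def contingency_table(scores, threshold):
    """2 x 2 table (TP, FP, FN, TN) at one decision threshold."""
    tp = sum(1 for s, lab in scores if lab == 1 and s >= threshold)
    fp = sum(1 for s, lab in scores if lab == 0 and s >= threshold)
    fn = sum(1 for s, lab in scores if lab == 1 and s < threshold)
    tn = sum(1 for s, lab in scores if lab == 0 and s < threshold)
    return tp, fp, fn, tn


# Hypothetical feature values, for illustration only.
cancer = [-5.0, -5.6, -4.8, -6.1, -5.3]
benign = [-6.5, -6.2, -7.0, -5.9, -6.8]
scores = leave_one_out_scores(cancer, benign)
tp, fp, fn, tn = contingency_table(scores, threshold=0.0)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
```

Repeating the last three lines for a range of thresholds gives the sensitivity and specificity pairs from which an ROC curve can be drawn.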


Feature Combination     Performance Az ± s.d.
ENT  SNR                0.8219 ± 0.0583
ENT  Intercept          0.8114 ± 0.0708
ENT  Slope              0.8090 ± 0.0660
ENT  Midband            0.7648 ± 0.0768
ENT  COR                0.8386 ± 0.0567
ENT  CON                0.7778 ± 0.0675

Table 12: Performance of the two-feature combinations

The classification performance of the six two-feature combinations discussed above is
shown in Table 12. The corresponding ROC curves are shown in Figure 38.

When Az is between 0.80 and 0.89, the accuracy of the classification is good. The
threshold level was set at Az > 0.82 to identify feature combinations that yielded good
classification performance. No single feature provided classification performance
above this level. The best classification performance achieved by a single feature is
0.77 ± 0.07, which was provided by ENT as shown in Figure 28. Only two
two-feature combinations provided classification performance of Az > 0.82:
ENT vs. SNR and ENT vs. COR. The combination of ENT and COR provided the
best classification performance among two-feature combinations, which is 0.84 ±
0.06. The scatter plots of these two combinations are shown in Figure 39 and
Figure 40, respectively.

Figure 38: ROC curves of two-feature combination

Figure 39: Scatter plot of the ENT and SNR



Figure 40: Scatter plot of the ENT and COR

Feature Combination         Performance Az ± s.d.
ENT  SNR  Intercept         0.8256 ± 0.0612
ENT  SNR  Slope             0.8541 ± 0.0542
ENT  SNR  Midband           0.7814 ± 0.0628
ENT  SNR  CON               0.8015 ± 0.0623
ENT  Intercept  Midband     0.7937 ± 0.0695
ENT  Intercept  COR         0.7908 ± 0.0646
ENT  Intercept  CON         0.7796 ± 0.0679
ENT  Slope  Midband         0.7755 ± 0.0713
ENT  Slope  COR             0.8144 ± 0.0604
ENT  Slope  CON             0.7926 ± 0.0693
ENT  Midband  COR           0.7444 ± 0.0693
ENT  Midband  CON           0.7476 ± 0.0677
ENT  COR  CON               0.8071 ± 0.0586

Table 13: Performance of the three-feature combinations

The performance of all the three-feature combinations is shown in Table 13. Only
two three-feature combinations provided classification performance of
Az > 0.82. The combination of ENT, SNR and slope provides the best classification
performance among three-feature combinations, which is 0.85 ± 0.05. Notice that this
combination also has the largest Mahalanobis distance (Table 11). The scatter plots of
the two best three-feature combinations are shown in Figure 42 and Figure 43,
respectively.


5.5 Parametric Image


In  addition  to  the  development  of  algorithms  for  the  computation  of  features  from  a 
user  defined  region  of  interest,  some  effort  has  been  directed  at  developing  software 
to  automatically  calculate  features  from  multiple  regions  of  interest  over  an  entire  RF 
data  set  (image)  in  order  to  produce  a  parametric  image  for  each  slice  corresponding 
to  the  B-mode  image.  A  preliminary  version  of  the  software  has  been  completed  and 
some  initial  parametric  images  have  been  produced  (Figure  44). 

The choice of the size of the region of interest (ROI) involves the classical
trade-off between resolution and sensitivity. Small ROIs give high spatial
resolution but very noisy feature estimates (poor sensitivity), while large ROIs give
very good sensitivity but very poor resolution. To generate the parametric image, the
ROI is scanned throughout each image. This can be done in two ways:

(1) “Scroll” the ROI in an overlapping way: analyze the first ROI starting
from one corner, then move over some fraction of an ROI and analyze the new ROI,
which will overlap the first one. Continue until the whole image is covered. The
advantage is that this leads to visually smoother parametric images.

(2) “Scroll” the ROI in a non-overlapping way. This is the same as above, but without
overlapping. It gives independent ROIs, but the disadvantage is that a parametric
image made from the results will be “blocky”.

In our situation, relatively large subregions must be used to reduce the variance of
the calculated slope values, and overlapping the regions is a way to increase the
apparent spatial resolution of the image when larger subregions are necessary. After
the ROI size is chosen, the desired parameters are calculated for the ROI, and the
ROI is “scrolled” in a specified overlapping or non-overlapping (0% overlap) way to
cover the whole image, yielding a multi-dimensional parametric matrix. The parametric
matrix is then converted to grayscale by applying an appropriate scaling factor and
displayed as a grayscale image. The images shown in Figure 44 demonstrate the higher
level of detail afforded by using overlapping sub-regions.
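
The scanning procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software developed in this project: the function names parametric_image and to_grayscale, the synthetic input array, and the use of the mean as a stand-in for the backscatter slope feature are all assumptions made for the example.

```python
import statistics

def parametric_image(data, roi_h, roi_w, overlap, feature):
    """Scan an ROI over a 2-D list and return a matrix of feature values.

    overlap is the fractional overlap between neighbouring ROIs (0.0 = none).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Step between ROI origins; at least one sample so the scan advances.
    step_r = max(1, round(roi_h * (1.0 - overlap)))
    step_c = max(1, round(roi_w * (1.0 - overlap)))
    n_rows, n_cols = len(data), len(data[0])
    out = []
    for r in range(0, n_rows - roi_h + 1, step_r):
        row = []
        for c in range(0, n_cols - roi_w + 1, step_c):
            roi = [v for line in data[r:r + roi_h] for v in line[c:c + roi_w]]
            row.append(feature(roi))
        out.append(row)
    return out

def to_grayscale(matrix, levels=256):
    """Linearly map feature values onto integer gray levels 0..levels-1."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat image
    return [[int((v - lo) / span * (levels - 1)) for v in row] for row in matrix]

# Synthetic 8 x 12 "image"; statistics.mean stands in for the slope feature.
image = [[r * 12 + c for c in range(12)] for r in range(8)]
blocky = parametric_image(image, 4, 4, 0.0, statistics.mean)
smooth = parametric_image(image, 4, 4, 0.5, statistics.mean)
gray = to_grayscale(smooth)
```

With a 4 x 4 ROI on this 8 x 12 test array, 0% overlap gives a 2 x 3 parametric matrix, whereas 50% overlap gives a 3 x 5 matrix. The ROI size, and therefore the variance of each estimate, is unchanged; only the sampling of the image becomes denser.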


The intent is to display parametric images during the development process to confirm
proper operation of the software. In the end, however, the parametric data set will
be combined with elastographic and clinical probability data to produce a single
image in which the overall probability of cancer (based on all features) is displayed
for clinical use. The experience gained in producing these intermediate parametric
images will likely help to better display the final result: a parametric image where
cancer probability is the parameter.

[Figure 44 panels: 0% overlap, 25% overlap, 50% overlap, 75% overlap, 91.67% overlap,
and the B-mode image]

Figure 44: Parametric image of the prostate (case 48, slice 11) using the backscatter
slope feature with 5 mm x 12 RF vector regions of interest (ROI) and various degrees
of overlap of the regions of interest.
Chapter 6 Conclusions


6.1  Summary 

This research is part of the project “Combining Clinical, Sonographic, and
Elastographic Features to Improve the Detection of Prostate Cancer” (Award
Number: DAMD17-99-1-1007) and focuses on quantitative sonographic tissue
characterization.

First, the features used to help distinguish cancerous from benign tissue were
selected. RF-based and texture features were chosen because they reflect microscopic
and macroscopic tissue architecture, respectively.

In addition, software for computing ultrasound-based tissue features has been
developed. The software allows the user to locate cancers on ultrasound images by
comparison with the corresponding pathology slices. With the help of this tool, the
ROIs are marked interactively and the position of each ROI is stored in the database.
The texture and RF features are then calculated for each ROI (cancer or benign) and
the results are saved in the database.

Signal-to-noise ratios (SNR) have also been calculated for the tissue-mimicking
phantom, which should exhibit a μ/σ value very close to 1.91. Our results demonstrate
that sub-regions taken from within the phantom do in fact exhibit a μ/σ close to
1.91; the average value is 1.893.
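
The reference value of 1.91 is the mean-to-standard-deviation ratio of a Rayleigh-distributed envelope, sqrt(π/(4 − π)) ≈ 1.913. The short Python sketch below checks this numerically. It is an illustration under stated assumptions rather than the code used in this work: the simulated envelope is the magnitude of a complex Gaussian signal, and the names envelope_snr and rayleigh_snr are hypothetical.

```python
import math
import random
import statistics

def envelope_snr(samples):
    """Return mean / standard deviation of envelope samples."""
    return statistics.fmean(samples) / statistics.pstdev(samples)

# Theoretical mu/sigma for a Rayleigh-distributed envelope.
rayleigh_snr = math.sqrt(math.pi / (4.0 - math.pi))

# Simulate fully developed speckle: the envelope of a complex Gaussian signal.
rng = random.Random(0)
envelope = [math.hypot(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]
estimated = envelope_snr(envelope)
print(f"theory {rayleigh_snr:.3f}, simulated {estimated:.3f}")
```

The simulated value fluctuates around the theoretical one, with a spread that shrinks as the number of samples grows.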

The next step was to search for the best individual feature and the best feature
combinations for discriminating cancerous from noncancerous tissue based on ROC
analysis. Our results show that the best individual feature is entropy, the best
two-feature combination is entropy and correlation, and the best three-feature
combination is entropy, slope, and signal-to-noise ratio. The classification
performance is 0.84 ± 0.06 and 0.85 ± 0.05 (area under the receiver operating
characteristic curve), respectively. These preliminary results show that RF and
envelope-detected signal analyses are diagnostically useful for discriminating
prostate tissue.
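
For readers unfamiliar with the area under the ROC curve, the following Python sketch computes an empirical (nonparametric) estimate for a single feature. It is not the method used to obtain the values reported above, which may have come from dedicated ROC software such as the ROCKIT program cited in the references; the function name roc_auc and the feature values are hypothetical and are not data from this study.

```python
def roc_auc(positive_scores, negative_scores):
    """Empirical AUC: P(positive score > negative score), ties count 1/2."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical feature values, for illustration only.
cancer = [0.9, 0.8, 0.7, 0.55]
benign = [0.6, 0.4, 0.3, 0.2]
auc = roc_auc(cancer, benign)
```

An AUC of 0.5 corresponds to chance performance and 1.0 to perfect separation of the two classes.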


6.2  Possible  Future  Work 


This research can be extended in several directions. First, much more data are
needed to improve the robustness of the techniques and the estimates of their
performance. Next, exploring different features might prove more successful.
Elastography results and PSA information will be incorporated into the software. It
would also be interesting to explore the ability of more complex classifiers (such as
quadratic, k-nearest neighbor, and neural network classifiers) to distinguish normal
tissue from prostate cancer.

One improvement would be to use a constant ROI size to eliminate the bias in
statistical features due to ROI size, because larger ROIs lead to a reduction in
measured variance. One example is shown in Table 16. This bias may have affected the
clinical results, since cancer ROIs tended to be small and benign ROIs tended to be
larger. A solution to this problem is to divide each larger ROI into several
sub-regions and then average the results.
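
A minimal Python sketch of this sub-region averaging is given below. It assumes that the ROI is available as a 2-D array of envelope samples and that partial tiles at the edges are simply discarded; the names tiled_feature and snr, the tile size, and the sample values are hypothetical.

```python
import statistics

def snr(values):
    """Mean divided by standard deviation of the samples in one region."""
    return statistics.fmean(values) / statistics.pstdev(values)

def tiled_feature(roi, tile_h, tile_w, feature=snr):
    """Average a feature over fixed-size, non-overlapping tiles of a 2-D ROI.

    Partial tiles at the right and bottom edges are ignored so that every
    estimate comes from the same number of samples.
    """
    estimates = []
    for r in range(0, len(roi) - tile_h + 1, tile_h):
        for c in range(0, len(roi[0]) - tile_w + 1, tile_w):
            tile = [v for line in roi[r:r + tile_h] for v in line[c:c + tile_w]]
            estimates.append(feature(tile))
    if not estimates:
        raise ValueError("ROI is smaller than one tile")
    return statistics.fmean(estimates), len(estimates)

# Hypothetical 4 x 6 ROI split into 2 x 3 tiles, giving four tiles.
roi = [[1, 2, 3, 1, 2, 3],
       [3, 2, 1, 3, 2, 1],
       [1, 2, 3, 1, 2, 3],
       [3, 2, 1, 3, 2, 1]]
value, n_tiles = tiled_feature(roi, 2, 3)
```

Because every tile has the same size, the per-tile estimates for small cancer ROIs and large benign ROIs are computed from the same number of samples.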

Another possibility would be to scan the prostate specimens with a curved array and
to develop a classifier for those data.




References 


Bourke, P. (2001), “Distributions,”
http://astronomy.swin.edu.au/pbourke/analysis/distributions/, URL valid as of
Sep. 17, 2001.

CaP  (2001),  Prostate  Cancer  Definition, 

http://www.cancer-prostate.com/DefinitionsFramel  Source  1  .thm,  URL  valid  as  of 
June  12,  2001. 


Feleppa, E.J., Kalisz, A., Sokil-Melgar, J.B., Lizzi, F.L., Liu, T., Rosado, A.L., Shao,
M.C., Fair, W.R., Wang, Y., Cookson, M.S., Reuter, V.E., Heston, W.D.W. (1996),
“Typing of Prostate Tissue by Ultrasonic Spectrum Analysis,” IEEE Transactions on
Ultrasonics, Ferroelectrics, and Frequency Control, Vol. 43, No. 4, pp. 609-619.

Feleppa, E.J., et al. (1997), “Ultrasonic spectral-parameter imaging of the prostate,”
Int. J. of Imaging Syst. & Technol. 8, 11-25.

Feleppa, E.J., et al. (2001), “Application of spectrum-analysis and
neural-network-based imaging to detection and treatment of prostate cancer,” 26th
International Symposium on Ultrasonic Imaging and Tissue Characterization.

Frederick,  E.D.  (2000),  “Computer  Aided  Diagnosis  of  Acute  Pulmonary  Embolism,” 
A  dissertation  of  Duke  University 

Garra, B.S., Insana, M.F. (1989), “Quantitative ultrasonic detection and classification
of diffuse liver disease: comparison with human observer performance,” Investigative
Radiology 24:196-203.

Garra,  B.S.,  Krasner,  B.H.,  Horii,  S.C.,  Ascher  S.,  Mun,  S.K.,  Zeman,  R.K.  (1993), 
“Improving  the  Distinction  Between  Benign  and  Malignant  Breast  Lesions:  The 
Value  of  Sonographic  Texture  Analysis,”  Ultrasonic  Imaging  15,  267-285. 

Garra,  B.S.,  Insana,  M.F.  (1994),  “Quantitative  ultrasonic  detection  of  parenchymal 
structural  change  in  diffuse  renal  disease,”  Investigative  Radiology  29:134-140. 

Garra, B.S. (1998), “Combining Clinical, Sonographic, and Elastographic Features to
Improve the Detection of Prostate Cancer,” Revised Statement of Work.

Garra, B.S. (2000), “Combining Clinical, Sonographic, and Elastographic Features to
Improve the Detection of Prostate Cancer,” Annual Report of University of Vermont
and State Agricultural College.




Hahn, G.J., Shapiro, S.S. (1967), “Statistical Models in Engineering,” John Wiley &
Sons, Inc., New York.

Huynen, A.L., Giesen, R.J.B. (1994), “Analysis of ultrasonographic prostate images
for the detection of prostatic carcinoma: the automated urologic diagnostic expert
system,” Ultrasound Med. Biol. 20:1-10.

Insana, M.F., Garra, B.S., Brown, D.G., Shawker, T.H. (1986), “Analysis of
Ultrasound Image Texture via Generalized Rician Statistics,” Optical Engineering,
Vol. 25, No. 6, pp. 743-748.

Lachenbruch, P.A., Mickey, M.R. (1968), “Estimation of error rates in discriminant
analysis,” Technometrics 10, 1-11.

Lerski, R.A. (1988), “Practical Ultrasound,” Oxford University Press, Washington
DC.

Lizzi, F.L., Ostromogilsky, M., Feleppa, E.J., Rorke, M.C., Yaremko, M.M. (1986),
“Relationship of Ultrasonic Spectral Parameters to Features of Tissue
Microstructure,” IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency
Control, Vol. UFFC-34, No. 3.

MATLAB  Online  Help  (2000),  MathWorks,  Inc. 

Metz,  C.E.  (1998),  ROCKIT  Users  Guide. 

Mia,  R.S.  (1999),  “Classification  Performance  and  Reproducibility  of  New 
Parameters  for  Quantitative  Ultrasound  Tissue  Characterization,”  A  dissertation  of 
Johns  Hopkins  University. 

Minitab  User’s  Guide  (1997),  Minitab,  Inc. 

Mohanty,  Nirode.  (1987),  “Signal  Processing,”  Van  Nostrand  Reinhold  Company 
Inc.,  New  York. 

NCSU (2001), Medical Ultrasound Figure,
http://www5.bae.ncsu.edu/bae/research/blanchard/www/465/textbook/imaging/projects/ultrasound/project/drawing.html,
URL valid as of August 20, 2001.

PROACT (2001), The PSA Test,
http://www.prostateaction.org/acticles/psatesthowdoesitwork.html, URL valid as of
June 12, 2001.

Prostate (2001), Introduction to Prostate Cancer,
http://website.lineone.net/~prostate/intro.html, URL valid as of June 12, 2001.

ROC  (2001),  Handbook  of  Medical  Informatics 

http://www.mieur.n1/mihandbook/r  3  3/booktext/booktext  15  04  01  02o.htm,  URL 
valid  as  of  August  20, 2001. 

Trochim, W.M. (2001), Research Methods Knowledge Base,
http://trochim.human.cornell.edu/kb/statsimp.htm, URL valid as of August 20, 2001.

Wagner,  R.F.,  Smith,  S.W.,  Sandrik,  J.M.,  Lopez,  H.  (1983),  “Statistics  of  Speckle  in 
Ultrasound  B-Scans,”  IEEE  Transactions  on  Sonics  and  Ultrasonics,  Vol.  30,  No.  3, 
pp.  156-163. 

Wagner, R.F., et al. (1986), “Analysis of ultrasound image texture via generalized
Rician statistics,” Optical Engineering, Vol. 25, No. 6, pp. 743-748.

Wagner, R.F., et al. (1987), “Statistical properties of radio-frequency and
envelope-detected signals with applications to medical ultrasound,” Optical Society
of America, Vol. 4, No. 5, pp. 910-922.

Wear, K.A., Garra, B.S., Hall, T.J. (1995), “Measurements of ultrasonic backscatter
coefficients in human liver and kidney in vivo,” Acoustical Society of America, Vol.
98, No. 4, pp. 1852-1857.




Appendix  A:  Database  Structure 


Field Name        Data Type
PatientID         Number
USFileName        Number
USFileNumber      Number
PATHFileName      Number
PATHFileNumber    Number
Phantom           Number
StartingDepth     Number
StoppingDepth     Number

Table 14: Table structure - patient


Field Name        Data Type
Status            Number
USSliceFileName   Number
ROINum            Number
ROICharacter      Text
ROIX              Number
ROIY              Number
ROIWidth          Number
ROIHeight         Number

Table 15: Table structure - roilnf
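
As an illustration only, the two table structures above could be expressed as follows using Python's built-in sqlite3 module. The original database software is not specified here, so the mapping of “Number” and “Text” to SQLite's NUMERIC and TEXT types, the table name roi_info (standing in for the table of Table 15), the reading of the fifth ROI field as ROIX, and the example record are all assumptions.

```python
import sqlite3

# "Number" and "Text" are mapped to SQLite's NUMERIC and TEXT (an assumption).
SCHEMA = """
CREATE TABLE patient (
    PatientID NUMERIC, USFileName NUMERIC, USFileNumber NUMERIC,
    PATHFileName NUMERIC, PATHFileNumber NUMERIC, Phantom NUMERIC,
    StartingDepth NUMERIC, StoppingDepth NUMERIC
);
CREATE TABLE roi_info (
    Status NUMERIC, USSliceFileName NUMERIC, ROINum NUMERIC,
    ROICharacter TEXT, ROIX NUMERIC, ROIY NUMERIC,
    ROIWidth NUMERIC, ROIHeight NUMERIC
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical ROI record, for illustration only.
conn.execute(
    "INSERT INTO roi_info VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (1, 11, 1, "cancer", 40, 60, 20, 25),
)
roi_columns = [row[1] for row in conn.execute("PRAGMA table_info(roi_info)")]
area = conn.execute("SELECT ROIWidth * ROIHeight FROM roi_info").fetchone()[0]
```

Storing the ROI origin and size in this way is enough to recompute any feature for the same region later.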
Appendix B: Flowchart of GUI Usage


Figure  45:  Flowchart  of  GUI  usage 
Appendix C: Bias Caused by ROI Size


To demonstrate the bias introduced by ROI size, SNR is calculated for ROIs of
different sizes, with 12 ROIs at each size. Note that the larger ROIs contain the
smaller ones. The results show that larger ROIs have lower variance.


ROI Size    S.D. of SNR
20x25       0.250
30x38       0.132
35x44       0.096
40x50       0.085
45x56       0.084

Table 16: Bias caused by ROI size


Appendix  D:  Backscatter  Coefficient  of  Reference  Phantom 


Frequency (Hz)    Backscatter Coefficient

3.125000e06       2.1612715E-04
3.222656e06       2.4848658E-04
3.320313e06       2.7688060E-04
3.417969e06       3.0678869E-04
3.515625e06       3.5969596E-04
3.613281e06       4.3010002E-04
3.710938e06       4.2878010E-04
3.808594e06       4.7800399E-04
3.906250e06       6.4783212E-04
4.003906e06       5.2049971E-04
4.101563e06       7.9972047E-04
4.199219e06       6.8410835E-04
4.296875e06       7.8617007E-04
4.394531e06       1.0446353E-03
4.492188e06       9.0188358E-04
4.589844e06       1.3031829E-03
4.687500e06       1.2099128E-03
4.785156e06       1.1688722E-03
4.882813e06       1.4217630E-03
4.980469e06       1.2657929E-03
5.078125e06       1.5340993E-03
5.175781e06       1.6429626E-03
5.273438e06       1.6357866E-03
5.371094e06       2.2317523E-03
5.468750e06       2.0040637E-03
5.566406e06       2.0739918E-03
5.664063e06       2.3677654E-03
5.761719e06       2.2838814E-03
5.859375e06       2.6224537E-03
5.957031e06       2.8915587E-03
6.054688e06       3.0219131E-03
6.152344e06       3.4578224E-03
6.250000e06       3.2285405E-03
6.347656e06       3.4586722E-03
6.445313e06       4.0386640E-03
6.542969e06       3.9208517E-03
6.640625e06       3.9382731E-03
6.738281e06       4.0659769E-03
6.835938e06       3.9548110E-03
6.933594e06       4.0240278E-03
7.031250e06       4.4389609E-03
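Appendix E: Phase Shifting with the Hilbert Transform


$$H(f) = -j\,\mathrm{sgn}(f), \quad \text{where} \quad \mathrm{sgn}(f) = \begin{cases} -1, & f < 0 \\ 0, & f = 0 \\ 1, & f > 0 \end{cases}$$

A. A cosine is transformed into a sine:

$$x_1(t) = \cos w_0 t$$

$$\hat{x}_1(t) = x_1(t) \otimes h(t) \;\Leftrightarrow\; \hat{X}_1(f) = X_1(f)\,H(f)$$

$$\begin{aligned}
\hat{X}_1(f) &= \pi\left(\delta(2\pi f - w_0) + \delta(2\pi f + w_0)\right) H(f) \\
&= \pi\left(-j\,\delta(2\pi f - w_0) + j\,\delta(2\pi f + w_0)\right) \\
&= \pi j\left(\delta(2\pi f + w_0) - \delta(2\pi f - w_0)\right) \\
&= \frac{\pi}{j}\left(\delta(2\pi f - w_0) - \delta(2\pi f + w_0)\right) \;\Leftrightarrow\; \hat{x}_1(t) = \sin w_0 t
\end{aligned}$$

B. A sine is transformed into a minus cosine:

$$x_2(t) = \sin w_0 t$$

$$\hat{x}_2(t) = x_2(t) \otimes h(t) \;\Leftrightarrow\; \hat{X}_2(f) = X_2(f)\,H(f)$$

$$\begin{aligned}
\hat{X}_2(f) &= \frac{\pi}{j}\left(\delta(2\pi f - w_0) - \delta(2\pi f + w_0)\right) H(f) \\
&= \frac{\pi}{j}\left(-j\,\delta(2\pi f - w_0) - j\,\delta(2\pi f + w_0)\right) \\
&= -\pi\left(\delta(2\pi f + w_0) + \delta(2\pi f - w_0)\right) \;\Leftrightarrow\; \hat{x}_2(t) = -\cos w_0 t
\end{aligned}$$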
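
The two results of this appendix can be checked numerically. The Python sketch below is an illustration under stated assumptions, not the implementation used in this work: it applies H(f) = -j sgn(f) to a direct discrete Fourier transform of a sampled cosine and sine, and the function names dft and hilbert_transform as well as the signal length and frequency are chosen only for the example.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) discrete Fourier transform (fine for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def hilbert_transform(x):
    """Multiply the spectrum by H(f) = -j*sgn(f) and transform back."""
    n = len(x)
    spectrum = dft(x)
    shifted = []
    for m, value in enumerate(spectrum):
        if m == 0 or (n % 2 == 0 and m == n // 2):
            sgn = 0    # DC and Nyquist bins have no sign
        elif m < n / 2:
            sgn = 1    # positive frequencies
        else:
            sgn = -1   # negative frequencies
        shifted.append(-1j * sgn * value)
    return [v.real for v in dft(shifted, inverse=True)]

n, cycles = 64, 4      # a whole number of cycles avoids spectral leakage
t = [2 * math.pi * cycles * k / n for k in range(n)]
cos_wave = [math.cos(a) for a in t]
sin_wave = [math.sin(a) for a in t]
from_cos = hilbert_transform(cos_wave)   # expected to approximate sin
from_sin = hilbert_transform(sin_wave)   # expected to approximate -cos
```

For these test signals the transformed cosine agrees with the sine, and the transformed sine with the negated cosine, to within floating-point error.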